Using the theory of large deviations, we analyze the phase transition structure of the Curie-Weiss-Potts
spin model, which is a mean-field approximation to the Potts model. This analysis is carried out
both for the canonical ensemble and the microcanonical ensemble. Besides giving explicit formulas
for the microcanonical entropy and for the equilibrium macrostates with respect to the two ensembles,
we analyze ensemble equivalence and nonequivalence at the level of equilibrium macrostates,
relating these to concavity and support properties of the microcanonical entropy. The Curie-Weiss-Potts
model is the first statistical mechanical model for which such a detailed and rigorous analysis
has been carried out.
I. INTRODUCTION



The nearest-neighbor Potts model, introduced in [40], takes its place next to the Ising model as one of
the most versatile models in equilibrium statistical mechanics [49]. Section I.C of [49] presents a
mean-field approximation to the Potts model, defined in terms of a mean interaction averaged over all the sites
in the model. We refer to this approximation as the Curie-Weiss-Potts model. Both the nearest-neighbor
Potts model and the Curie-Weiss-Potts model are defined by sequences of probability distributions of $n$
spin random variables that may occupy one of $q$ different states $\theta^1, \ldots, \theta^q$, where $q \geq 3$. For $q = 2$ the
Potts model reduces to the Ising model while the Curie-Weiss-Potts model reduces to the much simpler
mean-field approximation to the Ising model known as the Curie-Weiss model [14].

Two ways in which the Curie-Weiss-Potts model approximates the Potts model, and in fact gives rigorous 
bounds on quantities in the Potts model, are discussed in [31] and [39]. Probabilistic limit theorems for the 
Curie-Weiss-Potts model are proved in [19], including the law of large numbers and its breakdown as well
as various types of central limit theorems. The model is also studied in [20], which focuses on a statistical 
estimation problem for two parameters defining the model. 

In order to carry out the analysis of the model in [19, 20], detailed information about the structure of
the set of canonical equilibrium macrostates is required, including the fact that it exhibits a discontinuous
phase transition as the inverse temperature $\beta$ increases through a critical value $\beta_c$. This information plays
a central role in the present paper, in which we use the theory of large deviations to study the equivalence
and nonequivalence of the sets of equilibrium macrostates for the microcanonical and canonical ensembles.
An important consequence of the discontinuous phase transition exhibited by the canonical ensemble in the
Curie-Weiss-Potts model is the implication that the nearest-neighbor Potts model on $\mathbb{Z}^d$ also undergoes a
discontinuous phase transition whenever $d$ is sufficiently large [4, Thm. 2.1].

In [15] the problem of the equivalence of the microcanonical and canonical ensembles was completely 
solved for a general class of statistical mechanical models including short-range and long-range spin models 
and models of turbulence. This problem is fundamental in statistical mechanics because it focuses on the 
appropriate probabilistic description of statistical mechanical systems. While the theory developed in [15]
is complete, our understanding is greatly enhanced by the insights obtained from studying specific models. 
In this regard the Curie-Weiss-Potts model is an excellent choice, lying at the boundary of the set of models 
for which a complete analysis involving explicit formulas is available. 

For the Curie-Weiss-Potts model ensemble equivalence at the thermodynamic level is studied numerically
in [29, §3-5]. This level of ensemble equivalence focuses on whether the microcanonical entropy
is concave on its domain; equivalently, whether the microcanonical entropy and the canonical free energy,
the basic thermodynamic functions in the two ensembles, can each be expressed as the Legendre-Fenchel
transform of the other [15, pp. 1036-1037]. Nonconcave anomalies in the microcanonical entropy partially
correspond to regions of negative specific heat and thus thermodynamic instability.

The present paper significantly extends [29, §3-5] by analyzing rigorously ensemble equivalence at the
thermodynamic level and by relating it to ensemble equivalence at the level of equilibrium macrostates via
the results in [15]. As prescribed by the theory of large deviations, the set $\mathcal{E}^u$ of microcanonical equilibrium
macrostates and the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates are defined in (2.4) and (2.3). These
macrostates are, respectively, the solutions of a constrained minimization problem involving probability
vectors on $\mathbb{R}^q$ and a related, unconstrained minimization problem. The equilibrium macrostates for the two
ensembles are probability vectors describing equilibrium configurations of the model in each ensemble in
the thermodynamic limit $n \to \infty$. For each $i = 1, 2, \ldots, q$, the $i$th component of an equilibrium macrostate
gives the asymptotic relative frequency of spins taking the spin-value $\theta^i$.

Defined via conditioning on $h_n$, the microcanonical ensemble expresses the conservation of physical
quantities such as the energy. Among other reasons, the mathematically more tractable canonical ensemble
was introduced by Gibbs [22] in the hope that in the $n \to \infty$ limit the two ensembles are equivalent; i.e., all
asymptotic properties of the model obtained via the microcanonical ensemble could be realized as
asymptotic properties obtained via the canonical ensemble. Although most textbooks in statistical mechanics,
including [1, 22, 28, 35, 41, 44], claim that the two ensembles always give the same predictions, in general
this is not the case [48]. There are many examples of statistical mechanical models for which nonequivalence
of ensembles holds over a wide range of model parameters and for which physically interesting
microcanonical equilibria are often omitted by the canonical ensemble. Besides the Curie-Weiss-Potts model,
these models include the mean-field Blume-Emery-Griffiths model [2, 3, 18], the Hamiltonian mean-field
model [12, 36], the mean-field X-Y model [11], models of turbulence [6, 16, 21, 00], models of plasmas
[34, 45], gravitational systems [23, 24, 25, 37, 47], and a model of the Lennard-Jones gas 0]. It is hoped
that our detailed analysis of ensemble nonequivalence in the Curie-Weiss-Potts model will contribute to an
understanding of this fascinating and fundamental phenomenon in a wide range of other settings.

In the present paper, after summarizing the large deviation analysis of the Curie-Weiss-Potts model in
Section 2, we give explicit formulas for the elements of $\mathcal{E}_\beta$ and the elements of $\mathcal{E}^u$ in Sections 3 and 4. This
analysis shows that $\mathcal{E}_\beta$ exhibits a discontinuous phase transition at a critical inverse temperature $\beta_c$ and that
$\mathcal{E}^u$ exhibits a continuous phase transition at a critical mean energy $u_c$. The implications of these different
phase transitions concerning ensemble nonequivalence are studied graphically in Section 5 and rigorously in
Section 6, where we exhibit a range of values of the mean energy for which the microcanonical equilibrium
macrostates are not realized canonically. As described in the main theorem in [15] and summarized here
in Theorem 5.1, this range of values of the mean energy is precisely the set on which the microcanonical
entropy is not concave. The analysis of this bridge between ensemble nonequivalence at the thermodynamic
level and ensemble nonequivalence at the level of equilibrium macrostates is one of the main contributions
of [15] for general models and of the present paper for the Curie-Weiss-Potts model. In a sequel to the
present paper [9], we will extend our analysis of the Curie-Weiss-Potts model to the so-called Gaussian
ensemble 0, l8l 126 , 27 , 3(J 3l to show, among other things, that for each value of the mean energy for 
which the microcanonical and canonical ensembles are nonequivalent, we can find a Gaussian ensemble 
that is fully equivalent with the microcanonical ensemble flfll . 



II. SETS OF EQUILIBRIUM MACROSTATES FOR THE TWO ENSEMBLES 

Let $q \geq 3$ be a fixed integer and define $\Lambda = \{\theta^1, \theta^2, \ldots, \theta^q\}$, where the $\theta^i$ are any $q$ distinct vectors
in $\mathbb{R}^q$. In the definition of the Curie-Weiss-Potts model, the precise values of these vectors are immaterial.
For each $n \in \mathbb{N}$ the model is defined by spin random variables $\omega_1, \ldots, \omega_n$ that take values in $\Lambda$. The
canonical and microcanonical ensembles for the model are defined in terms of probability measures on the
configuration spaces $\Lambda^n$, which consist of the microstates $\omega = (\omega_1, \ldots, \omega_n)$. We also introduce the $n$-fold
product measure $P_n$ on $\Lambda^n$ with identical one-dimensional marginals

\[ \rho = \frac{1}{q} \sum_{i=1}^q \delta_{\theta^i}. \]

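Thus for all $\omega \in \Lambda^n$, $P_n(\omega) = 1/q^n$. For $n \in \mathbb{N}$ and $\omega \in \Lambda^n$ the Hamiltonian for the $q$-state
Curie-Weiss-Potts model is defined by

\[ H_n(\omega) = -\frac{1}{2n} \sum_{j,k=1}^n \delta(\omega_j, \omega_k), \]

where $\delta(\omega_j, \omega_k)$ equals 1 if $\omega_j = \omega_k$ and equals 0 otherwise. The energy per particle is defined by

\[ h_n(\omega) = \frac{1}{n} H_n(\omega) = -\frac{1}{2n^2} \sum_{j,k=1}^n \delta(\omega_j, \omega_k). \]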

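For inverse temperature $\beta \in \mathbb{R}$ and subsets $B$ of $\Lambda^n$ the canonical ensemble is the probability measure
$P_{n,\beta}$ defined by

\[ P_{n,\beta}(B) = \frac{1}{\sum_{\omega \in \Lambda^n} \exp[-n\beta h_n(\omega)]} \cdot \sum_{\omega \in B} \exp[-n\beta h_n(\omega)]. \]

For mean energy $u \in \mathbb{R}$ and $r > 0$ the microcanonical ensemble is the conditioned probability measure
$P_n^{u,r}$ defined by

\[ P_n^{u,r}\{B\} = P_n\{B \mid h_n \in [u - r, u + r]\}. \]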

The key to our analysis of the Curie-Weiss-Potts model is to express both the canonical and the
microcanonical ensembles in terms of the empirical vector

\[ L_n = L_n(\omega) = (L_{n,1}(\omega), L_{n,2}(\omega), \ldots, L_{n,q}(\omega)), \]

the $i$th component of which is defined by

\[ L_{n,i}(\omega) = \frac{1}{n} \sum_{j=1}^n \delta(\omega_j, \theta^i). \]

This quantity equals the relative frequency with which $\omega_j$, $j \in \{1, \ldots, n\}$, equals $\theta^i$. $L_n$ takes values in the
set of probability vectors

\[ \mathcal{P} = \Big\{ \nu \in \mathbb{R}^q : \nu = (\nu_1, \nu_2, \ldots, \nu_q), \text{ each } \nu_i \geq 0, \ \sum_{i=1}^q \nu_i = 1 \Big\}. \]

As we will see, each probability vector in $\mathcal{P}$ represents a possible equilibrium macrostate for the model.

There is a one-to-one correspondence between $\mathcal{P}$ and the set $\mathcal{P}(\Lambda)$ of probability measures on $\Lambda$,
$\nu \in \mathcal{P}$ corresponding to the probability measure $\sum_{i=1}^q \nu_i \delta_{\theta^i}$. The element $\rho \in \mathcal{P}$ corresponding to the
one-dimensional marginal $\rho$ of the prior measures $P_n$ is the uniform vector having equal components $\frac{1}{q}$.

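We denote by $\langle \cdot, \cdot \rangle$ the inner product on $\mathbb{R}^q$. Since

\[ \sum_{i=1}^q L_{n,i}(\omega) L_{n,i}(\omega) = \frac{1}{n^2} \sum_{i=1}^q \Big( \sum_{j=1}^n \delta(\omega_j, \theta^i) \Big) \Big( \sum_{k=1}^n \delta(\omega_k, \theta^i) \Big) = \frac{1}{n^2} \sum_{j,k=1}^n \delta(\omega_j, \omega_k), \]

it follows that the energy per particle can be rewritten as

\[ h_n(\omega) = -\frac{1}{2n^2} \sum_{j,k=1}^n \delta(\omega_j, \omega_k) = -\frac{1}{2} \langle L_n(\omega), L_n(\omega) \rangle, \]

i.e.,

\[ h_n(\omega) = H(L_n(\omega)), \quad \text{where } H(\nu) = -\tfrac{1}{2} \langle \nu, \nu \rangle \text{ for } \nu \in \mathcal{P}. \tag{2.1} \]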

We call H the energy representation function. 
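
The identity (2.1) can be checked numerically. The following Python sketch is only an illustration: the function names are hypothetical, and the integers 0, ..., q-1 stand in for the spin values $\theta^1, \ldots, \theta^q$, whose precise values are immaterial. It compares the double-sum definition of $h_n(\omega)$ with $H(L_n(\omega))$ for a random microstate.

```python
import random
from collections import Counter

def energy_per_particle(omega):
    """h_n(omega) = -(1/(2 n^2)) * sum_{j,k} delta(omega_j, omega_k), by direct double sum."""
    n = len(omega)
    matches = sum(1 for a in omega for b in omega if a == b)
    return -matches / (2 * n * n)

def empirical_vector(omega, q):
    """L_n(omega): relative frequency of each spin value 0..q-1 (stand-ins for theta^i)."""
    n = len(omega)
    counts = Counter(omega)
    return [counts.get(i, 0) / n for i in range(q)]

def H(nu):
    """Energy representation function H(nu) = -(1/2) <nu, nu>."""
    return -0.5 * sum(x * x for x in nu)

random.seed(0)
q, n = 3, 200
omega = [random.randrange(q) for _ in range(n)]
difference = energy_per_particle(omega) - H(empirical_vector(omega, q))
print(abs(difference) < 1e-12)
```

The double sum costs $O(n^2)$ operations while the empirical vector costs $O(n)$, which is one practical reason for working with $L_n$.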

We appeal to the theory of large deviations to define the sets of microcanonical equilibrium macrostates
and canonical equilibrium macrostates. Sanov's Theorem states that with respect to the product measures
$P_n$, the empirical vectors $L_n$ satisfy the large deviation principle (LDP) on $\mathcal{P}$ with rate function given by
the relative entropy $R(\cdot|\rho)$ [14, Thm. VIII.2.1]. For $\nu \in \mathcal{P}$ this is defined by

\[ R(\nu|\rho) = \sum_{i=1}^q \nu_i \log(q \nu_i). \]

We express this LDP by the formal notation $P_n\{L_n \in d\nu\} \approx \exp[-n R(\nu|\rho)]$. The LDPs for $L_n$ with
respect to the two ensembles $P_{n,\beta}$ and $P_n^{u,r}$ in the thermodynamic limit $n \to \infty$, $r \to 0$ can be proved
from the LDP for the $P_n$-distributions of $L_n$ as in Theorems 2.4 and 3.2 in [15], in which minor notational
changes have to be made. We express these LDPs by the formal notation

\[ P_{n,\beta}\{L_n \in d\nu\} \approx \exp[-n I_\beta(\nu)] \quad \text{and} \quad P_n^{u,r}\{L_n \in d\nu\} \approx \exp[-n I^u(\nu)], \tag{2.2} \]

where for $\nu \in \mathcal{P}$

\[ I_\beta(\nu) = R(\nu|\rho) - \frac{\beta}{2} \langle \nu, \nu \rangle - \text{const} \]

and

\[ I^u(\nu) = \begin{cases} R(\nu|\rho) - \text{const} & \text{if } -\frac{1}{2} \langle \nu, \nu \rangle = u \\ \infty & \text{otherwise.} \end{cases} \]

The constants appearing in the definitions of $I_\beta$ and $I^u$ have the properties that $\inf_{\nu \in \mathcal{P}} I_\beta(\nu) = 0$ and
$\inf_{\nu \in \mathcal{P}} I^u(\nu) = 0$. Thus $I_\beta$ and $I^u$ map $\mathcal{P}$ into $[0, \infty]$.

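As the formulas in (2.2) suggest, if $I_\beta(\nu) > 0$ or $I^u(\nu) > 0$, then $\nu$ has an exponentially small
probability of being observed in the corresponding ensemble in the thermodynamic limit. Hence it makes sense
to define the corresponding sets of equilibrium macrostates to be

\[ \mathcal{E}_\beta = \{\nu \in \mathcal{P} : I_\beta(\nu) = 0\} \quad \text{and} \quad \mathcal{E}^u = \{\nu \in \mathcal{P} : I^u(\nu) = 0\}. \]

A rigorous justification for this is given in [15, Thm. 2.4(d)]. Using the formulas for $I_\beta$ and $I^u$, we see that

\[ \mathcal{E}_\beta = \big\{\nu \in \mathcal{P} : \nu \text{ minimizes } R(\nu|\rho) - \tfrac{\beta}{2} \langle \nu, \nu \rangle \big\} \tag{2.3} \]

and

\[ \mathcal{E}^u = \big\{\nu \in \mathcal{P} : \nu \text{ minimizes } R(\nu|\rho) \text{ subject to } -\tfrac{1}{2} \langle \nu, \nu \rangle = u \big\}. \tag{2.4} \]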

Each element $\nu$ in $\mathcal{E}_\beta$ and $\mathcal{E}^u$ describes an equilibrium configuration of the model in the corresponding
ensemble in the thermodynamic limit. The $i$th component $\nu_i$ gives the asymptotic relative frequency of
spins taking the value $\theta^i$.
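
To make the unconstrained minimization in (2.3) concrete, the following Python sketch searches a coarse grid of probability vectors for $q = 3$. It is an illustration under stated assumptions, not the method used in the paper: the grid resolution `N`, the restriction to vectors with decreasing components (which picks one representative modulo permutations), and all function names are choices made here.

```python
import math

def relative_entropy(nu):
    """R(nu|rho) = sum_i nu_i * log(q * nu_i), with 0*log(0) read as 0."""
    q = len(nu)
    return sum(x * math.log(q * x) for x in nu if x > 0)

def canonical_objective(nu, beta):
    """R(nu|rho) - (beta/2) <nu, nu>, the function minimized in (2.3)."""
    return relative_entropy(nu) - 0.5 * beta * sum(x * x for x in nu)

def grid_minimizer(beta, N=60):
    """Best grid point (i/N, j/N, k/N) with i >= j >= k >= 0 and i + j + k = N."""
    best, best_val = None, math.inf
    for i in range(N + 1):
        for j in range(min(i, N - i) + 1):
            k = N - i - j
            if k > j:
                continue  # keep components in decreasing order
            nu = (i / N, j / N, k / N)
            val = canonical_objective(nu, beta)
            if val < best_val:
                best, best_val = nu, val
    return best

small_beta_minimizer = grid_minimizer(1.0)
large_beta_minimizer = grid_minimizer(5.0)
```

For a small inverse temperature such as $\beta = 1$ the search returns the uniform vector, while for a large value such as $\beta = 5$ it returns a grid point concentrated on one spin value.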

The question of equivalence of ensembles at the level of equilibrium macrostates focuses on the
relationships between $\mathcal{E}^u$, defined in terms of the constrained minimization problem in (2.4), and $\mathcal{E}_\beta$, defined
in terms of the related, unconstrained minimization problem in (2.3). We will focus on this question in
Sections 5 and 6 after we determine the structures of $\mathcal{E}_\beta$ and $\mathcal{E}^u$ in the next two sections.



III. FORM OF $\mathcal{E}_\beta$ AND ITS DISCONTINUOUS PHASE TRANSITION

In this section we derive the form of the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates for all $\beta \in \mathbb{R}$. This
form is given in Theorem 3.1, which shows that with respect to the canonical ensemble the Curie-Weiss-Potts
model undergoes a discontinuous phase transition at the critical inverse temperature

\[ \beta_c = \frac{2(q-1)}{q-2} \log(q-1). \tag{3.1} \]

In order to describe the form of $\mathcal{E}_\beta$, we introduce the function $\psi$ that maps $[0, 1]$ into $\mathcal{P}$ and is defined by

\[ \psi(w) = \left( \frac{1 + (q-1)w}{q}, \frac{1-w}{q}, \ldots, \frac{1-w}{q} \right); \tag{3.2} \]

the last $q - 1$ components all equal $\frac{1-w}{q}$. Recalling that $\rho$ is the uniform vector in $\mathcal{P}$ having equal components
$\frac{1}{q}$, we see that $\rho = \psi(0)$.

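Theorem 3.1. For $\beta > 0$ let $w(\beta)$ be the largest solution of the equation

\[ w = \frac{1 - e^{-\beta w}}{1 + (q-1) e^{-\beta w}}. \tag{3.3} \]

The following conclusions hold.

(a) The quantity $w(\beta)$ is well defined and lies in $[0, 1]$. It is positive, strictly increasing, and differentiable
for $\beta \in (\beta_c, \infty)$ and satisfies $w(\beta_c) = \frac{q-2}{q-1}$ and $\lim_{\beta \to \infty} w(\beta) = 1$.

(b) For $\beta \geq \beta_c$, define $\nu^1(\beta) = \psi(w(\beta))$ and let $\nu^i(\beta)$, $i = 2, \ldots, q$, denote the points in $\mathbb{R}^q$ obtained
by interchanging the first and $i$th components of $\nu^1(\beta)$. Then the set $\mathcal{E}_\beta$ defined in (2.3) has the form

\[ \mathcal{E}_\beta = \begin{cases} \{\rho\} & \text{for } \beta < \beta_c, \\ \{\rho, \nu^1(\beta_c), \nu^2(\beta_c), \ldots, \nu^q(\beta_c)\} & \text{for } \beta = \beta_c, \\ \{\nu^1(\beta), \nu^2(\beta), \ldots, \nu^q(\beta)\} & \text{for } \beta > \beta_c. \end{cases} \tag{3.4} \]

For $\beta \geq \beta_c$, the vectors in $\mathcal{E}_\beta$ are all distinct and each $\nu^i(\beta)$ is continuous. The vector $\nu^1(\beta_c)$ is given by

\[ \nu^1(\beta_c) = \psi(w(\beta_c)) = \psi\left( \frac{q-2}{q-1} \right) = \left( 1 - \frac{1}{q}, \frac{1}{q(q-1)}, \ldots, \frac{1}{q(q-1)} \right); \tag{3.5} \]

the last $q - 1$ components all equal $\frac{1}{q(q-1)}$.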

The form of $\mathcal{E}_\beta$ for $\beta > 0$ is proved in Appendix B from a new convex-duality theorem proved in
Appendix A and from the complicated calculation of the global minimum points of a related function given
in Theorem 2.1 in [19]. The form of $\mathcal{E}_\beta$ for $\beta \leq 0$ is also determined in Appendix B.
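
The largest solution $w(\beta)$ of (3.3) is easy to approximate numerically. The sketch below is one possible approach, not the argument of the cited proof: it iterates the right-hand side of (3.3) starting from $w = 1$. Because that right-hand side is increasing in $w$ for $\beta > 0$ and is smaller than 1 at $w = 1$, the iterates decrease monotonically to the largest fixed point in $[0, 1]$. The function names and tolerances are assumptions made for illustration.

```python
import math

def rhs(w, beta, q):
    """Right-hand side of (3.3)."""
    x = math.exp(-beta * w)
    return (1.0 - x) / (1.0 + (q - 1) * x)

def largest_solution(beta, q, tol=1e-14, max_iter=100000):
    """Approximate w(beta) by monotone fixed-point iteration started at w = 1."""
    w = 1.0
    for _ in range(max_iter):
        w_next = rhs(w, beta, q)
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    return w

def beta_c(q):
    """Critical inverse temperature (3.1)."""
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)

def psi(w, q):
    """The map (3.2) from [0, 1] into the set of probability vectors."""
    return [(1 + (q - 1) * w) / q] + [(1 - w) / q] * (q - 1)

w_critical = largest_solution(beta_c(3), 3)
nu_critical = psi(w_critical, 3)
```

At $\beta = \beta_c$ the computed value agrees with $w(\beta_c) = \frac{q-2}{q-1}$ from part (a) of Theorem 3.1 up to the iteration tolerance.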

For $\beta > 0$ the form of $\mathcal{E}_\beta$ reflects a competition between disorder, as represented by the relative entropy
$R(\nu|\rho)$, and order, as represented by the energy representation function $-\frac{1}{2} \langle \nu, \nu \rangle$. For small $\beta > 0$, $R(\nu|\rho)$
predominates. Since $R(\nu|\rho)$ attains its minimum of 0 at the unique vector $\rho$, we expect that for small $\beta$, $\mathcal{E}_\beta$
should contain a single vector. On the other hand, for large $\beta > 0$, $-\frac{1}{2} \langle \nu, \nu \rangle$ predominates. This function
attains its minimum at $\nu^1 = (1, 0, \ldots, 0)$ and at the vectors $\nu^i$, $i = 2, \ldots, q$, obtained by interchanging the
first and $i$th components of $\nu^1$. Hence we expect that for large $\beta$, $\mathcal{E}_\beta$ should contain $q$ distinct vectors $\nu^i(\beta)$
having the property that $\nu^i(\beta) \to \nu^i$ as $\beta \to \infty$. The major surprise of the theorem is that for $\beta = \beta_c$, $\mathcal{E}_\beta$
consists of the $q + 1$ distinct vectors $\rho$ and $\nu^i(\beta_c)$ for $i = 1, 2, \ldots, q$.
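
The coexistence of $\rho$ and $\nu^1(\beta_c)$ at $\beta = \beta_c$ means that the function minimized in (2.3) takes the same value at both vectors. The following Python sketch checks this equality numerically from (3.1) and (3.5); it verifies only the tie between these two candidates, not that they are global minimizers, and the function names are illustrative.

```python
import math

def relative_entropy(nu):
    """R(nu|rho) = sum_i nu_i * log(q * nu_i), with 0*log(0) read as 0."""
    q = len(nu)
    return sum(x * math.log(q * x) for x in nu if x > 0)

def canonical_objective(nu, beta):
    """R(nu|rho) - (beta/2) <nu, nu>, the function minimized in (2.3)."""
    return relative_entropy(nu) - 0.5 * beta * sum(x * x for x in nu)

def tie_values(q):
    """Objective at rho and at nu^1(beta_c) from (3.5), both evaluated at beta_c."""
    beta_c = 2.0 * (q - 1) / (q - 2) * math.log(q - 1)
    rho = [1.0 / q] * q
    nu1 = [1.0 - 1.0 / q] + [1.0 / (q * (q - 1))] * (q - 1)
    return canonical_objective(rho, beta_c), canonical_objective(nu1, beta_c)

value_at_rho, value_at_nu1 = tie_values(3)
```

A short calculation gives the common value $-\frac{(q-1) \log(q-1)}{q(q-2)}$, which the two returned numbers reproduce up to rounding.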

The discontinuous bifurcation in the composition of $\mathcal{E}_\beta$ from 1 vector for $\beta < \beta_c$ to $q + 1$ vectors
for $\beta = \beta_c$ to $q$ vectors for $\beta > \beta_c$ corresponds to a discontinuous phase transition exhibited by the
canonical ensemble. In Figure 2 in Section 5 this phase transition is shown together with the continuous
phase transition exhibited by the microcanonical ensemble. The latter phase transition and the form of the
set of microcanonical equilibrium macrostates are the focus of the next section.
IV. FORM OF $\mathcal{E}^u$ AND ITS CONTINUOUS PHASE TRANSITION

We now turn to the form of the set $\mathcal{E}^u$ for all $u \in [-\frac{1}{2}, -\frac{1}{2q}]$, which is the set of $u$ for which $\mathcal{E}^u$ is
nonempty. In the specific case $q = 3$ part (c) of Theorem 4.2 gives the form of $\mathcal{E}^u$, the calculation of which
is much simpler than the calculation of the form of $\mathcal{E}_\beta$. The proof is based on the method of Lagrange
multipliers, which also works for general $q \geq 4$ provided the next conjecture on the form of the elements in
$\mathcal{E}^u$ is valid. The validity of this conjecture has been confirmed numerically for all $q \in \{4, 5, \ldots, 10^4\}$ and
all $u \in (-\frac{1}{2}, -\frac{1}{2q})$ of the form $u = -\frac{1}{2} + 0.02k$, where $k$ is a positive integer.

Conjecture 4.1. For any $q \geq 4$ and all $u \in (-\frac{1}{2}, -\frac{1}{2q})$, there exist $a \neq b \in (0, 1)$ such that modulo
permutations, any $\nu \in \mathcal{E}^u$ has the form $(a, b, \ldots, b)$; the last $q - 1$ components of which all equal $b$.

Parts (a) and (b) of Theorem 4.2 are proved for general $q \geq 3$. Part (c) shows that modulo permutations,
for $q = 3$, $\nu \in \mathcal{E}^u$ has the form $(a(u), b(u), b(u))$ and determines the precise formulas for $a(u)$ and $b(u)$.
As specified in part (d), for $q \geq 4$ we can also determine the precise formula for $\nu \in \mathcal{E}^u$ provided Conjecture
4.1 is valid.

Theorem 4.2 shows that with respect to the microcanonical ensemble the Curie-Weiss-Potts model
undergoes a continuous phase transition as $u$ decreases from the critical mean-energy value $u_c = -\frac{1}{2q}$. This
contrast with the discontinuous phase transition exhibited by the canonical ensemble is closely related to
the nonequivalence of the microcanonical and canonical ensembles for a range of $u$. Ensemble equivalence
and nonequivalence will be explored in the next section, where we will see that it is reflected by support and
concavity properties of the microcanonical entropy. An explicit formula for the microcanonical entropy is
given in Theorem 4.3.

Theorem 4.2. For $u \in \mathbb{R}$ we define $\mathcal{E}^u$ by (2.4). The following conclusions hold.

(a) For any $q \geq 3$, $\mathcal{E}^u$ is nonempty if and only if $u \in [-\frac{1}{2}, -\frac{1}{2q}]$. This interval coincides with the range
of the energy representation function $H(\nu) = -\frac{1}{2} \langle \nu, \nu \rangle$ on $\mathcal{P}$.

(b) For any $q \geq 3$, $\mathcal{E}^{-\frac{1}{2q}} = \{\rho\} = \{(\frac{1}{q}, \frac{1}{q}, \ldots, \frac{1}{q})\}$ and

\[ \mathcal{E}^{-\frac{1}{2}} = \{(1, 0, \ldots, 0), (0, 1, \ldots, 0), \ldots, (0, 0, \ldots, 1)\}. \]

(c) Let $q = 3$. For $u \in (-\frac{1}{2}, -\frac{1}{6})$, $\mathcal{E}^u$ consists of the 3 distinct vectors $\{\nu^1(u), \nu^2(u), \nu^3(u)\}$, where
$\nu^1(u) = (a(u), b(u), b(u))$,

\[ a(u) = \frac{1 + \sqrt{2(-6u - 1)}}{3} \quad \text{and} \quad b(u) = \frac{2 - \sqrt{2(-6u - 1)}}{6}. \tag{4.1} \]

The vectors $\nu^2(u)$ and $\nu^3(u)$ denote the points in $\mathbb{R}^3$ obtained by interchanging the first and the $i$th
components of $\nu^1(u)$.

(d) Let $q \geq 4$ and assume that Conjecture 4.1 is valid. Then for $u \in (-\frac{1}{2}, -\frac{1}{2q})$, $\mathcal{E}^u$ consists of the $q$
distinct vectors $\{\nu^1(u), \ldots, \nu^q(u)\}$, where $\nu^1(u) = (a(u), b(u), \ldots, b(u))$,

\[ a(u) = \frac{1 + \sqrt{(q-1)(-2qu - 1)}}{q} \quad \text{and} \quad b(u) = \frac{q - 1 - \sqrt{(q-1)(-2qu - 1)}}{(q-1)q}. \]

The last $q - 1$ components of $\nu^1(u)$ all equal $b(u)$, and the vectors $\nu^i(u)$, $i = 2, \ldots, q$, denote the points in
$\mathbb{R}^q$ obtained by interchanging the first and the $i$th components of $\nu^1(u)$.
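
As a sanity check on the formulas in parts (c) and (d), the following Python sketch evaluates $\nu^1(u) = (a(u), b(u), \ldots, b(u))$ and confirms that it satisfies the two constraints $\sum_j \nu_j = 1$ and $-\frac{1}{2} \langle \nu, \nu \rangle = u$. It checks feasibility only; minimality is what the theorem asserts (for $q \geq 4$, conditionally on Conjecture 4.1). The function name and the clamping of tiny negative rounding errors under the square root are choices made here.

```python
import math

def macrostate(u, q):
    """nu^1(u) = (a(u), b(u), ..., b(u)) from Theorem 4.2(c)-(d)."""
    # max(0, .) guards against tiny negative values caused by rounding at u = -1/(2q)
    root = math.sqrt(max(0.0, (q - 1) * (-2 * q * u - 1)))
    a = (1 + root) / q
    b = (q - 1 - root) / ((q - 1) * q)
    return [a] + [b] * (q - 1)

def constraint_errors(u, q):
    """Deviations from sum(nu) = 1 and from -(1/2) <nu, nu> = u."""
    nu = macrostate(u, q)
    return abs(sum(nu) - 1.0), abs(-0.5 * sum(x * x for x in nu) - u)

errors_q3 = constraint_errors(-0.3, 3)
errors_q5 = constraint_errors(-0.3, 5)
```

At the endpoints the formulas reduce to the vectors in part (b): $u = -\frac{1}{2}$ gives $(1, 0, \ldots, 0)$ and $u = -\frac{1}{2q}$ gives $\rho$.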
We return to part (c) of Theorem 4.2 in order to discuss the nature of the phase transition exhibited by the
microcanonical ensemble. The functions $a(u)$ and $b(u)$ given in (4.1) are both continuous for $u \in [-\frac{1}{2}, -\frac{1}{6}]$
and satisfy

\[ \lim_{u \to (-\frac{1}{6})^-} a(u) = \lim_{u \to (-\frac{1}{6})^-} b(u) = \tfrac{1}{3} = a(-\tfrac{1}{6}) = b(-\tfrac{1}{6}). \]

Therefore, for $i = 1, \ldots, q$, $\lim_{u \to (-\frac{1}{2q})^-} \nu^i(u) = \rho$. It follows that the microcanonical ensemble exhibits a
continuous phase transition as $u$ decreases from $u_c = -\frac{1}{2q}$, the unique equilibrium macrostate $\rho$ for $u = u_c$
bifurcating continuously into the $q$ distinct macrostates $\nu^i(u)$ as $u$ decreases from its maximum value.
This is rigorously true for $q = 3$. Provided Conjecture 4.1 is true, it is also true for $q \geq 4$, as one easily
checks using part (d) of Theorem 4.2.

Before proving Theorem 4.2, we introduce the microcanonical entropy

\[ s(u) = -\inf\{R(\nu|\rho) : \nu \in \mathcal{P}, \ -\tfrac{1}{2} \langle \nu, \nu \rangle = u\}. \tag{4.2} \]

As we will see in the next section, this function plays a crucial role in the analysis of ensemble equivalence
and nonequivalence for the Curie-Weiss-Potts model. Since $0 \leq R(\nu|\rho)$ for all $\nu \in \mathcal{P}$, $s(u) \in [-\infty, 0]$
for all $u$, and since $R(\nu|\rho) > R(\rho|\rho) = 0$ for all $\nu \neq \rho$, $s$ attains its maximum of 0 at the unique value
$-\frac{1}{2q} = -\frac{1}{2} \langle \rho, \rho \rangle$.

The domain of $s$ is defined by $\mathrm{dom}\, s = \{u \in \mathbb{R} : s(u) > -\infty\}$. Since $R(\nu|\rho) < \infty$ for all $\nu \in \mathcal{P}$,
$\mathrm{dom}\, s$ equals the range of $H(\nu) = -\frac{1}{2} \langle \nu, \nu \rangle$ on $\mathcal{P}$, which is the interval $[-\frac{1}{2}, -\frac{1}{2q}]$ [Thm. 4.2(a)]. As we
have seen, $s(-\frac{1}{2q}) = 0$. For $u \in (-\frac{1}{2}, -\frac{1}{2q})$, according to parts (c)-(d) of Theorem 4.2, $\mathcal{E}^u$ consists of the
unique vector $\nu^1(u)$ modulo permutations. Since for $i = 2, 3, \ldots, q$, $R(\nu^i(u)|\rho) = R(\nu^1(u)|\rho)$, we
conclude that

\[ s(u) = -R(\nu^1(u)|\rho) = -a(u) \log(q\, a(u)) - (q - 1)\, b(u) \log(q\, b(u)). \]

Finally, for $u = -\frac{1}{2}$, modulo permutations $\mathcal{E}^u$ consists of the unique vector $(1, 0, \ldots, 0)$ [see (4.7)], and so
$s(-\frac{1}{2}) = -R((1, 0, \ldots, 0)|\rho) = -\log q$. The resulting formulas for $s(u)$ are recorded in the next theorem,
where we distinguish between $q = 3$ and $q \geq 4$.
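
The formula $s(u) = -a(u) \log(q\, a(u)) - (q-1)\, b(u) \log(q\, b(u))$ translates directly into code. The Python sketch below is illustrative (the function name is hypothetical); for $q \geq 4$ it inherits the dependence on Conjecture 4.1, and it uses the convention $0 \log 0 = 0$ at $u = -\frac{1}{2}$.

```python
import math

def microcanonical_entropy(u, q):
    """s(u) = -a log(q a) - (q-1) b log(q b), with a(u), b(u) as in Theorem 4.2(c)-(d)."""
    root = math.sqrt(max(0.0, (q - 1) * (-2 * q * u - 1)))
    a = (1 + root) / q
    b = (q - 1 - root) / ((q - 1) * q)
    total = 0.0
    for weight, x in ((1, a), (q - 1, b)):
        if x > 0:  # convention 0 * log(0) = 0
            total += weight * x * math.log(q * x)
    return -total

s_at_top = microcanonical_entropy(-1 / 6, 3)     # value at u = -1/(2q) for q = 3
s_at_bottom = microcanonical_entropy(-0.5, 3)    # value at u = -1/2 for q = 3
```

The two endpoint values reproduce $s(-\frac{1}{2q}) = 0$ and $s(-\frac{1}{2}) = -\log q$ up to rounding.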

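Theorem 4.3. We define the microcanonical entropy $s(u)$ in (4.2). The following conclusions hold.

(a) $\mathrm{dom}\, s = [-\frac{1}{2}, -\frac{1}{2q}]$; for any $u \in \mathrm{dom}\, s$, $u \neq -\frac{1}{2q}$, $s(u) < s(-\frac{1}{2q}) = 0$; and $s(-\frac{1}{2}) = -\log q$.

(b) Let $q = 3$. Then for $u \in (-\frac{1}{2}, -\frac{1}{2q}) = (-\frac{1}{2}, -\frac{1}{6})$

\[ s(u) = -\frac{1 + \sqrt{2(-6u - 1)}}{3} \log\left( 1 + \sqrt{2(-6u - 1)} \right) - \frac{2 - \sqrt{2(-6u - 1)}}{3} \log\left( \frac{2 - \sqrt{2(-6u - 1)}}{2} \right). \tag{4.3} \]

(c) Let $q \geq 4$ and assume that Conjecture 4.1 is valid. Then for $u \in (-\frac{1}{2}, -\frac{1}{2q})$

\[ s(u) = -\frac{1 + \sqrt{(q-1)(-2qu - 1)}}{q} \log\left( 1 + \sqrt{(q-1)(-2qu - 1)} \right) - \frac{q - 1 - \sqrt{(q-1)(-2qu - 1)}}{q} \log\left( \frac{q - 1 - \sqrt{(q-1)(-2qu - 1)}}{q - 1} \right). \tag{4.4} \]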
We now turn to the proof of Theorem 4.2, which gives the form of $\mathcal{E}^u$. We start by proving part (a). The
set $\mathcal{E}^u$ of microcanonical equilibrium macrostates consists of all $\nu \in \mathcal{P}$ that minimize the relative entropy
$R(\nu|\rho)$ subject to the constraint that

\[ H(\nu) = -\tfrac{1}{2} \langle \nu, \nu \rangle = u. \]

Let $u = -\frac{1}{2} r^2$. Since $\mathcal{P}$ consists of all nonnegative vectors in $\mathbb{R}^q$ satisfying $\nu_1 + \cdots + \nu_q = 1$, the constraint
set in the minimization problem defining $\mathcal{E}^u$ is given by

\[ C(u) = C(-\tfrac{1}{2} r^2) = \Big\{ \nu \in \mathbb{R}^q : \nu_1 \geq 0, \ldots, \nu_q \geq 0, \ \sum_{j=1}^q \nu_j = 1, \ \sum_{j=1}^q \nu_j^2 = r^2 \Big\}. \tag{4.5} \]

Geometrically, $C(-\frac{1}{2} r^2)$ is the intersection of the nonnegative orthant of $\mathbb{R}^q$, the hyperplane consisting of
$\nu \in \mathbb{R}^q$ that satisfy $\nu_1 + \cdots + \nu_q = 1$, and the hypersphere in $\mathbb{R}^q$ with center 0 and radius $r$. Clearly,
$C(u) \neq \emptyset$ if and only if $u$ lies in the range of the energy representation function $H(\nu) = -\frac{1}{2} \langle \nu, \nu \rangle$ on $\mathcal{P}$.
Because $0 \leq R(\nu|\rho) < \infty$ for all $\nu \in C(u)$, the range of $H$ on $\mathcal{P}$ also equals the set of $u$ for which $\mathcal{E}^u \neq \emptyset$.

The geometric description of $C(u)$ makes it straightforward to determine those values of $u$ for which
this constraint set is nonempty. The smallest value of $r$ for which $C(-\frac{1}{2} r^2) \neq \emptyset$ is obtained when the
hypersphere of radius $r$ is tangent to the hyperplane, the point of tangency being $\rho = (\frac{1}{q}, \frac{1}{q}, \ldots, \frac{1}{q})$, the
closest probability vector to the origin. The hypersphere and the hyperplane are tangent when $r = \frac{1}{\sqrt{q}}$,
which coincides with the distance from the center of the hypersphere to the hyperplane. It follows that the
largest value of $u$ for which $C(u) \neq \emptyset$, and thus $\mathcal{E}^u \neq \emptyset$, is $u = -\frac{1}{2} r^2 = -\frac{1}{2q}$. In this case

\[ C(-\tfrac{1}{2q}) = \{\rho\} = \{(\tfrac{1}{q}, \tfrac{1}{q}, \ldots, \tfrac{1}{q})\} = \mathcal{E}^{-\frac{1}{2q}}. \tag{4.6} \]

For all sufficiently large $r$, $C(-\frac{1}{2} r^2)$ is empty because the hypersphere of radius $r$ has empty intersection
with the intersection of the hyperplane and the nonnegative orthant of $\mathbb{R}^q$. The largest value for $r$ for which
this does not occur is found by subtracting the two equations defining the hyperplane and the hypersphere.
Since each $\nu_j \in [0, 1]$, it follows that

\[ 0 \leq \sum_{j=1}^q \nu_j (1 - \nu_j) = 1 - r^2, \]

and this in turn implies that $r^2 \leq 1$. Thus $r = 1$ is the largest value for $r$ for which $C(-\frac{1}{2} r^2) \neq \emptyset$. We
conclude that the smallest value of $u$ for which $C(u) \neq \emptyset$, and thus $\mathcal{E}^u \neq \emptyset$, is $u = -\frac{1}{2} r^2 = -\frac{1}{2}$. The set
$\mathcal{E}^{-\frac{1}{2}}$ consists of the points at which the hyperplane intersects each of the positive coordinate axes; i.e.,

\[ \mathcal{E}^{-\frac{1}{2}} = \{(1, 0, \ldots, 0), (0, 1, \ldots, 0), \ldots, (0, 0, \ldots, 1)\}. \tag{4.7} \]

This completes the proof of part (a) of Theorem 4.2.
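
The range of $H$ on $\mathcal{P}$ established in part (a) can be illustrated by sampling. The Python sketch below is a numerical illustration, not a proof: it draws random probability vectors (normalized exponential variates, a standard way to sample the simplex uniformly) and records the smallest and largest observed values of $H(\nu) = -\frac{1}{2} \langle \nu, \nu \rangle$. Function names, sample size, and seed are arbitrary choices.

```python
import random

def H(nu):
    """Energy representation function H(nu) = -(1/2) <nu, nu>."""
    return -0.5 * sum(x * x for x in nu)

def random_probability_vector(q, rng):
    """A random point of the simplex, obtained by normalizing exponential variates."""
    e = [rng.expovariate(1.0) for _ in range(q)]
    total = sum(e)
    return [x / total for x in e]

def sampled_energy_range(q, samples=10000, seed=0):
    """Smallest and largest value of H over random probability vectors."""
    rng = random.Random(seed)
    values = [H(random_probability_vector(q, rng)) for _ in range(samples)]
    return min(values), max(values)

lowest, highest = sampled_energy_range(4)
```

All sampled values fall in $[-\frac{1}{2}, -\frac{1}{2q}]$, and the two endpoints are attained exactly at a coordinate vector and at $\rho$, respectively.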

We now determine the form of $\mathcal{E}^u$ as specified in parts (b)-(d) of Theorem 4.2. Part (b) considers any
$q \geq 3$ and the values $u = -\frac{1}{2q}$ and $u = -\frac{1}{2}$, part (c) $q = 3$ and $u \in (-\frac{1}{2}, -\frac{1}{6})$, and part (d) $q \geq 4$ and
$u \in (-\frac{1}{2}, -\frac{1}{2q})$. Part (b) has already been proved; for $u = -\frac{1}{2q}$ and $u = -\frac{1}{2}$, the sets $\mathcal{E}^u$ are given in (4.6)
and (4.7).

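We now consider $q \geq 3$ and $u \in (-\frac{1}{2}, -\frac{1}{2q})$. For $\nu \in \mathcal{P}$ define

\[ K(\nu) = \sum_{j=1}^q \nu_j \quad \text{and} \quad H(\nu) = -\frac{1}{2} \sum_{j=1}^q \nu_j^2. \]

By definition $\nu = (\nu_1, \ldots, \nu_q) \in \mathcal{E}^u$ if and only if $\nu$ minimizes $R(\nu|\rho) = \sum_{j=1}^q \nu_j \log(q \nu_j)$ subject to the
constraints $K(\nu) = 1$, $H(\nu) = u$, and $\nu_1 \geq 0, \ldots, \nu_q \geq 0$. For $u \in (-\frac{1}{2}, -\frac{1}{2q})$ we divide into two parts the
calculation of the form of $\nu \in \mathcal{E}^u$. First we use Lagrange multipliers to solve the constrained minimization
problem when $\nu_1 > 0, \ldots, \nu_q > 0$. Then we argue that the vectors $\nu$ found via Lagrange multipliers solve
the original constrained minimization problem when $\nu_1 \geq 0, \ldots, \nu_q \geq 0$.

We introduce Lagrange multipliers $\gamma$ and $\lambda$. Any critical point of $R(\nu|\rho)$ subject to the constraints
$K(\nu) = 1$, $H(\nu) = u$, and $\nu_1 > 0, \nu_2 > 0, \ldots, \nu_q > 0$ satisfies

\[ \begin{cases} \nabla R(\nu|\rho) = \gamma \nabla K(\nu) + \lambda \nabla H(\nu) \\ K(\nu) = 1 \\ H(\nu) = u \\ \nu_j > 0 \text{ for } j = 1, 2, \ldots, q. \end{cases} \]

Since $\partial R(\nu|\rho) / \partial \nu_j = \log(q \nu_j) + 1$, $\partial K / \partial \nu_j = 1$, and $\partial H / \partial \nu_j = -\nu_j$, this system of equations is equivalent to

\[ \begin{cases} \log(q \nu_j) + 1 = \gamma - \lambda \nu_j \text{ for } j = 1, 2, \ldots, q \\ \sum_{j=1}^q \nu_j = 1 \\ \sum_{j=1}^q \nu_j^2 = -2u \\ \nu_j > 0 \text{ for } j = 1, 2, \ldots, q. \end{cases} \tag{4.8} \]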




By properties of the logarithm, the first equation can have at most two solutions: viewed as an equation in
the single unknown $\nu_j$, it equates a strictly concave function with an affine one. Hence modulo permutations,
there exists $n \in \{0, 1, \ldots, q\}$ and distinct numbers $a, b \in (0, 1)$ such that the first $n$ components of
any critical point $\nu$ all equal $a$ and the last $q - n$ components of $\nu$ all equal $b$. The second and third equations
in (4.8) take the form

\[ na + (q - n)b = 1 \quad \text{and} \quad na^2 + (q - n)b^2 = -2u. \tag{4.9} \]

If $n = 0$, then $b = \frac{1}{q}$, while if $n = q$, then $a = \frac{1}{q}$. Both cases correspond to $\nu = (\frac{1}{q}, \ldots, \frac{1}{q}) = \rho$ and
$u = -\frac{1}{2q}$, which does not lie in the open interval $(-\frac{1}{2}, -\frac{1}{2q})$ currently under consideration.
We now consider $1 \leq n \leq q - 1$. In this case the two solutions of (4.9) are

\[ a_1(n) = \frac{n - \sqrt{n(q-n)(-2qu - 1)}}{nq}, \quad b_1(n) = \frac{q - n + \sqrt{n(q-n)(-2qu - 1)}}{(q-n)q}, \tag{4.10} \]

and

\[ a_2(n) = \frac{n + \sqrt{n(q-n)(-2qu - 1)}}{nq}, \quad b_2(n) = \frac{q - n - \sqrt{n(q-n)(-2qu - 1)}}{(q-n)q}. \tag{4.11} \]

Since $u \in (-\frac{1}{2}, -\frac{1}{2q})$, these quantities are all well defined and $a_j(n) \neq b_j(n)$ provided $u < -\frac{1}{2q}$.
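
The two solution pairs (4.10) and (4.11) can be verified by substituting them back into (4.9). The Python sketch below does this numerically; the function names are illustrative, and the check covers only the two equality constraints, not the positivity of the components.

```python
import math

def critical_values(n, q, u):
    """Return ((a1, b1), (a2, b2)) from (4.10)-(4.11) for 1 <= n <= q - 1."""
    t = math.sqrt(n * (q - n) * (-2 * q * u - 1))
    a1, b1 = (n - t) / (n * q), (q - n + t) / ((q - n) * q)
    a2, b2 = (n + t) / (n * q), (q - n - t) / ((q - n) * q)
    return (a1, b1), (a2, b2)

def satisfies_4_9(n, q, u, a, b, tol=1e-12):
    """Check n*a + (q-n)*b = 1 and n*a^2 + (q-n)*b^2 = -2u."""
    linear = abs(n * a + (q - n) * b - 1.0)
    quadratic = abs(n * a * a + (q - n) * b * b + 2.0 * u)
    return linear < tol and quadratic < tol

all_ok = all(
    satisfies_4_9(n, q, -0.3, a, b)
    for q in (3, 5)
    for n in range(1, q)
    for (a, b) in critical_values(n, q, -0.3)
)
```

The same function also exhibits the symmetry used below for $q = 3$: exchanging $n$ with $q - n$ swaps the roles of the $a$ and $b$ values between the two solution pairs.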

We now specialize to $q = 3$, the case considered in part (c) of Theorem 4.2. When $q = 3$, the interval
$(-\frac{1}{2}, -\frac{1}{2q})$ equals $(-\frac{1}{2}, -\frac{1}{6})$, and we have $n \in \{1, 2\}$. Equations (4.10) and (4.11) take the form

\[ a_1(n) = \frac{n - \sqrt{n(3-n)(-6u - 1)}}{3n}, \quad b_1(n) = \frac{3 - n + \sqrt{n(3-n)(-6u - 1)}}{3(3-n)} \]

and

\[ a_2(n) = \frac{n + \sqrt{n(3-n)(-6u - 1)}}{3n}, \quad b_2(n) = \frac{3 - n - \sqrt{n(3-n)(-6u - 1)}}{3(3-n)}. \]

Any critical point $\nu$ either has $n$ components equal to $a_1(n)$ and $q - n$ components equal to $b_1(n)$ or has $n$
components equal to $a_2(n)$ and $q - n$ components equal to $b_2(n)$.
Modulo permutations, the value $n = 1$ corresponds to

\[ \nu = (a_1(1), b_1(1), b_1(1)) \quad \text{or} \quad \nu = (a_2(1), b_2(1), b_2(1)), \]

and the value $n = 2$ corresponds to

\[ \nu = (a_1(2), a_1(2), b_1(2)) \quad \text{or} \quad \nu = (a_2(2), a_2(2), b_2(2)). \]

For $j \in \{1, 2\}$, one easily checks that

\[ a_j(1) = b_{3-j}(2) \quad \text{and} \quad a_j(2) = b_{3-j}(1). \]

Thus, modulo permutations $(a_1(1), b_1(1), b_1(1)) = (a_2(2), a_2(2), b_2(2))$ and $(a_2(1), b_2(1), b_2(1)) =
(a_1(2), a_1(2), b_1(2))$, and so modulo permutations, $n = 1$ and $n = 2$ yield the same points. This shows that
it suffices to consider only the case $n = 1$. Since for all $u \in (-\frac{1}{2}, -\frac{1}{6})$

\[ R\big( (a_2(1), b_2(1), b_2(1)) \,\big|\, \rho \big) < R\big( (a_1(1), b_1(1), b_1(1)) \,\big|\, \rho \big), \]

we conclude that modulo permutations $\nu = (a_2(1), b_2(1), b_2(1))$ is the unique minimizer of $R(\nu|\rho)$ subject
to the constraints $K(\nu) = 1$, $H(\nu) = u$, and $\nu_1 > 0, \nu_2 > 0, \nu_3 > 0$.
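
The comparison between the two candidate critical points for $q = 3$ can be reproduced numerically. The Python sketch below is an illustration with hypothetical function names. It evaluates the comparison only where the first candidate has a positive first component, that is, where $\sqrt{2(-6u - 1)} < 1$ (equivalently $u > -\frac{1}{4}$); this restriction is an assumption of the sketch, made so that both vectors are probability vectors with positive entries.

```python
import math

def relative_entropy(nu):
    """R(nu|rho) = sum_i nu_i * log(q * nu_i), with 0*log(0) read as 0."""
    q = len(nu)
    return sum(x * math.log(q * x) for x in nu if x > 0)

def candidates(u):
    """Return (a_1(1), b_1(1), b_1(1)) and (a_2(1), b_2(1), b_2(1)) for q = 3."""
    t = math.sqrt(2 * (-6 * u - 1))
    if t >= 1:
        raise ValueError("a_1(1) is not positive for this u; need u > -1/4")
    first = ((1 - t) / 3, (2 + t) / 6, (2 + t) / 6)
    second = ((1 + t) / 3, (2 - t) / 6, (2 - t) / 6)
    return first, second

def second_is_better(u):
    """True if the second candidate has strictly smaller relative entropy."""
    first, second = candidates(u)
    return relative_entropy(second) < relative_entropy(first)

checks = [second_is_better(u) for u in (-0.24, -0.22, -0.2, -0.18, -0.17)]
```

On these sample energies the candidate $(a_2(1), b_2(1), b_2(1))$ has the smaller relative entropy, in line with the inequality stated above.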

We now prove for $q = 3$ that the minimizers found via Lagrange multipliers when $\nu_1 > 0, \nu_2 > 0, \nu_3 > 0$
also minimize $R(\nu|\rho)$ subject to the constraints $K(\nu) = 1$, $H(\nu) = u$, and $\nu_1 \geq 0, \nu_2 \geq 0, \nu_3 \geq 0$. If
$\nu = (\nu_1, \nu_2, \nu_3)$ satisfies the constraints and has two components equal to zero, then modulo permutations
$\nu = (1, 0, 0)$ and $H(\nu) = u = -\frac{1}{2}$, which does not lie in the open interval $(-\frac{1}{2}, -\frac{1}{6})$ currently under
consideration. Thus we only have to consider the case where $\nu$ has one component equal to zero; i.e.,
$\nu = (0, a_0, b_0)$ with $a_0 \geq b_0$. In this case the second and third equations in (4.8) have the solution

\[ a_0 = \frac{1 + \sqrt{-4u - 1}}{2}, \quad b_0 = \frac{1 - \sqrt{-4u - 1}}{2}. \]

We now claim that modulo permutations the unique minimizer of $R(\nu|\rho)$ subject to the constraints $K(\nu) =
1$, $H(\nu) = u$, and $\nu_1 \geq 0, \nu_2 \geq 0, \nu_3 \geq 0$ has the form $(a_2(1), b_2(1), b_2(1))$ found in the preceding
paragraph. The claim follows from the calculation

\[ R\big( (a_2(1), b_2(1), b_2(1)) \,\big|\, \rho \big) < R\big( (0, a_0, b_0) \,\big|\, \rho \big), \]

which is valid for all $u \in (-\frac{1}{2}, -\frac{1}{6})$ for which the point $(0, a_0, b_0)$ exists, i.e., for which $-4u - 1 \geq 0$.
This completes the proof of part (c) of Theorem 4.2, which gives the
form of $\nu \in \mathcal{E}^u$ for $q = 3$ and $u \in (-\frac{1}{2}, -\frac{1}{6})$.
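
The boundary comparison can also be checked numerically. The Python sketch below, with illustrative function names, evaluates both sides for $q = 3$ at energies where the boundary point $(0, a_0, b_0)$ is real, which requires $-4u - 1 \geq 0$; the sample energies are arbitrary.

```python
import math

def relative_entropy(nu):
    """R(nu|rho) = sum_i nu_i * log(q * nu_i), with 0*log(0) read as 0."""
    q = len(nu)
    return sum(x * math.log(q * x) for x in nu if x > 0)

def interior_candidate(u):
    """(a_2(1), b_2(1), b_2(1)) for q = 3."""
    t = math.sqrt(2 * (-6 * u - 1))
    return ((1 + t) / 3, (2 - t) / 6, (2 - t) / 6)

def boundary_candidate(u):
    """(0, a_0, b_0); real only when -4u - 1 >= 0, i.e. u <= -1/4."""
    t = math.sqrt(-4 * u - 1)
    return (0.0, (1 + t) / 2, (1 - t) / 2)

def interior_wins(u):
    """True if the interior candidate has strictly smaller relative entropy."""
    return relative_entropy(interior_candidate(u)) < relative_entropy(boundary_candidate(u))

checks = [interior_wins(u) for u in (-0.49, -0.45, -0.4, -0.3, -0.25)]
```

Both candidates satisfy $K(\nu) = 1$ and $H(\nu) = u$, so the comparison is between two feasible points of the same constraint set.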

We now turn to part (d) of Theorem 4.2, which gives the form of $\mathcal{E}^u$ for $q \geq 4$ and $u \in (-\frac{1}{2}, -\frac{1}{2q})$.
If, as in the case $q = 3$, we knew that modulo permutations, the minimizers have the form $(a, b, \ldots, b)$ as
specified in Conjecture 4.1, then as in the case $q = 3$ we would be able to derive explicit formulas for these
minimizers. If Conjecture 4.1 is true, then it is easily verified that modulo permutations, $\mathcal{E}^u$ consists of the
unique point $\nu = (a_2(1), b_2(1), \ldots, b_2(1))$, where $a_2(1)$ and $b_2(1)$ are defined in (4.11) for $u \in (-\frac{1}{2}, -\frac{1}{2q})$.
This gives part (d) of Theorem 4.2. The proof of the theorem is complete.

At the end of Section 6 we will see that there exists an explicit value of $u_0 \in (-\frac{1}{2}, -\frac{1}{2q})$ such that
Conjecture 4.1 is valid for any $q \geq 4$ and all $u \in (-\frac{1}{2}, u_0]$. Hence for these values of $u$ the form of
$\nu \in \mathcal{E}^u$ given in part (d) of Theorem 4.2 and the formula for $s(u)$ given in part (c) of Theorem 4.3 are both
rigorously true.
V. EQUIVALENCE AND NONEQUIVALENCE OF ENSEMBLES

As we saw in Section 3, the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates undergoes a discontinuous
phase transition as $\beta$ increases through $\beta_c = \frac{2(q-1)}{q-2} \log(q - 1)$, the unique macrostate $\rho$ bifurcating
discontinuously into the $q$ distinct macrostates $\nu^i(\beta)$. By contrast, as we saw in Section 4, the set $\mathcal{E}^u$ of
microcanonical equilibrium macrostates undergoes a continuous phase transition as $u$ decreases from $u_c = -\frac{1}{2q}$,
the unique macrostate $\rho$ bifurcating continuously into the $q$ distinct macrostates $\nu^i(u)$. The different
continuity properties of these phase transitions show already that the canonical and microcanonical ensembles
are nonequivalent. In this section we study this nonequivalence in detail and relate the equivalence and
nonequivalence of these two sets of equilibrium macrostates to concavity and support properties of the
microcanonical entropy $s$ defined in (4.2). This is done with the help of Figure 2, which is based on the form
of $s$ in Figure 1 and on the results on ensemble equivalence and nonequivalence in Theorem 5.1. In Figures
3 and 4 at the end of the section we give, for $q = 3$, a beautiful geometric representation of $\mathcal{E}_\beta$ and $\mathcal{E}^u$ that
also shows the ensemble nonequivalence for a range of $u$.

We start by stating in Theorem 5.1 results on ensemble equivalence and nonequivalence for the
Curie-Weiss-Potts model. Analogous results are derived in Theorems 4.4, 4.6, and 4.8 in [15] for a wide range of
statistical mechanical models, of which the Curie-Weiss-Potts model is a special case. For $u \in \operatorname{dom} s$ the
possible relationships between $\mathcal{E}^u$ and $\mathcal{E}_\beta$, given in part (a) of Theorem 5.1, are that the ensembles are
fully equivalent, partially equivalent, or nonequivalent. Since by part (b) canonical equilibrium macrostates
are always realized microcanonically and since, by part (a)(iii), microcanonical equilibrium macrostates
are in general not realized canonically, it follows that the microcanonical ensemble is the richer of the two
ensembles.

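Theorem 5.1. We define $s$ by (4.2) and $\mathcal{E}_\beta$ and $\mathcal{E}^u$ by (2.3) and (2.4). The following conclusions hold.

(a) For fixed $u \in \operatorname{dom} s$ one of the following three possibilities occurs.

(i) Full equivalence. There exists $\beta \in \mathbb{R}$ such that $\mathcal{E}^u = \mathcal{E}_\beta$. This is the case if and only if $s$ has a
strictly supporting line at $u$ with slope $\beta$; i.e.,

\[ s(v) < s(u) + \beta(v - u) \quad \text{for all } v \neq u. \]

(ii) Partial equivalence. There exists $\beta \in \mathbb{R}$ such that $\mathcal{E}^u \subset \mathcal{E}_\beta$ but $\mathcal{E}^u \neq \mathcal{E}_\beta$. This is the case if
and only if $s$ has a nonstrictly supporting line at $u$ with slope $\beta$; i.e.,

\[ s(v) \leq s(u) + \beta(v - u) \quad \text{for all } v \in \mathbb{R}, \text{ with equality for some } v \neq u. \]

(iii) Nonequivalence. For all $\beta \in \mathbb{R}$, $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$. This is the case if and only if $s$ has no
supporting line at $u$; i.e., for any $\beta \in \mathbb{R}$ there exists $v$ such that $s(v) > s(u) + \beta(v - u)$.

(b) Canonical is always realized microcanonically. For $\nu \in \mathcal{P}$ we define $H(\nu) = -\frac{1}{2}\langle \nu, \nu \rangle$. Then for
any $\beta \in \mathbb{R}$

\[ \mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u. \]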
We next relate ensemble equivalence and nonequivalence with concavity and support properties of s in 
the Curie-Weiss-Potts model. For $q = 3$ an explicit formula for $s$ is given in part (b) of Theorem 4.3. If
Conjecture 4.1 is true, then the formula for $s$ given in part (c) of Theorem 4.3 is also valid for $q \geq 4$. All the
concavity and support features of $s$ are exhibited in Figure 1. However, this figure is not the actual graph
of s but a schematic graph that accentuates the shape of s together with the intervals of strict concavity and 
nonconcavity of s. 
FIG. 1: Schematic graph of $s(u)$, showing the set $F = (-\frac{1}{2}, u_0) \cup \{-\frac{1}{2q}\}$ of full ensemble equivalence, the singleton
set $P = \{u_0\}$ of partial equivalence, and the set $N = (u_0, -\frac{1}{2q})$ of nonequivalence. For $u \in F \cup P = (-\frac{1}{2}, u_0] \cup
\{-\frac{1}{2q}\}$, $s(u) = s^{**}(u)$; for $u \in N$, $s(u) < s^{**}(u)$ and the graph of $s^{**}$ consists of the dotted line segment with slope
$\beta_c$. The slope of $s$ at $-\frac{1}{2}$ is $\infty$.



Concavity properties of $s$ are defined in reference to the double-Legendre-Fenchel transform $s^{**}$, which
can be characterized as the smallest concave, upper semicontinuous function that satisfies $s^{**}(u) \geq s(u)$
for all $u \in \mathbb{R}$ [10, Prop. A.2]. For $u \in \operatorname{dom} s$ we say that $s$ is concave at $u$ if $s(u) = s^{**}(u)$ and that $s$ is
not concave at $u$ if $s(u) < s^{**}(u)$. Also, we say that $s$ is strictly concave at $u \in \operatorname{dom} s$ if $s$ has a strictly
supporting line at $u$ and that $s$ is strictly concave on a convex subset $A$ of $\operatorname{dom} s$ if $s$ is strictly concave at
each $u \in A$.

According to Figure 1 and Theorem 5.1, there exists $u_0 \in (-\frac{1}{2}, -\frac{1}{2q})$ with the following properties.

• $s$ is strictly concave on the interval $(-\frac{1}{2}, u_0)$ and at the point $-\frac{1}{2q}$. Hence for $u \in F = (-\frac{1}{2}, u_0) \cup
\{-\frac{1}{2q}\}$ the ensembles are fully equivalent [Thm. 5.1(a)(i)]. In fact, for $u \in (-\frac{1}{2}, u_0)$, $\mathcal{E}^u = \mathcal{E}_\beta$ with
$\beta$ given by the thermodynamic formula $\beta = s'(u)$.

• $s$ is concave but not strictly concave at $u_0$ and has a nonstrictly supporting line at $u_0$ that also touches
the graph of $s$ over the right-hand endpoint $-\frac{1}{2q}$. Hence for $u = u_0$ the ensembles are partially
equivalent in the sense that there exists $\beta \in \mathbb{R}$ such that $\mathcal{E}^u \subset \mathcal{E}_\beta$ but $\mathcal{E}^u \neq \mathcal{E}_\beta$ [Thm. 5.1(a)(ii)]. In
fact, $\beta$ equals the critical inverse temperature $\beta_c$ defined in (3.1).

• $s$ is not concave on the interval $N = (u_0, -\frac{1}{2q})$ and has no supporting line at any $u \in N$ [10,
Thm. A.4(c)]. Hence for $u \in N$ the ensembles are nonequivalent in the sense that for all $\beta \in \mathbb{R}$,
$\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ [Thm. 5.1(a)(iii)].

We point out two additional features of Figure 1. First, although $\mathcal{E}^u \neq \emptyset$ for $u$ equal to the left-hand
endpoint $-\frac{1}{2}$ of $\operatorname{dom} s$, we do not include this point in the set $F$ of full ensemble equivalence. Indeed, $s$
is not strictly concave at $-\frac{1}{2}$ because there is no strictly supporting line at $-\frac{1}{2}$; as one can see in (5.1), the
slope of $s$ at $-\frac{1}{2}$ is $\infty$. Nevertheless, by introducing the limiting set

\[ \mathcal{E}_\infty = \{(1,0,\ldots,0), (0,1,\ldots,0), \ldots, (0,0,\ldots,1)\} = \lim_{\beta \to \infty} \mathcal{E}_\beta, \]

we can extend full ensemble equivalence to $u = -\frac{1}{2}$ since $\mathcal{E}^{-1/2} = \mathcal{E}_\infty$.

Second, for $u$ in the interval $N$ of ensemble nonequivalence, the graph of $s^{**}$ is affine; this is depicted
by the dotted line segment in Figure 1. The slope of the affine portion of the graph of $s^{**}$ equals the critical
inverse temperature $\beta_c$ defined in (3.1). This can be proved using concave-duality relationships involving
FIG. 2: For $q = 8$ the top right plot shows $\mathcal{E}_\beta$, the top left plot the graph of $s'(u)$ for $u \in \operatorname{dom} s = [u_\ell, u_c] =
[-\frac{1}{2}, -\frac{1}{16}]$, and the bottom left plot $\mathcal{E}^u$. The discontinuous phase transition at $\beta_c$ in the top right plot and the
continuous phase transition at $u_c$ in the bottom left plot imply that the ensembles are nonequivalent for all $u \in N = (u_0, u_c)$.
On this interval $s$ is not concave and $s^{**}$ is affine with slope $\beta_c$. The shaded area in the bottom left plot corresponds
to the region of nonequivalence of ensembles delineated by $u \in N$.



$s^{**}$ and the canonical free energy. The quantity $\beta_c$ also satisfies an equal-area property, first observed by
Maxwell [28, p. 45] and explained in the context of another spin model in Jlil p. 535]. 

The relationships stated in the three bulleted items above give valuable insight into equivalence and 
nonequivalence of ensembles in the Curie-Weiss-Potts model. These relationships are illustrated in Figure 
2. In this figure we exhibit the graph of $s'$ and the sets $\mathcal{E}_\beta$ and $\mathcal{E}^u$ in order to compare the phase transitions
in the two ensembles and to understand the implications for ensemble equivalence and nonequivalence. In
order to accentuate properties of $s'$, $\mathcal{E}_\beta$, and $\mathcal{E}^u$ that are related to ensemble equivalence and nonequivalence,
we focus on $q = 8$. In presenting the graph of $s'$ and the form of $\mathcal{E}^u$, we assume that for $q = 8$ Conjecture
4.1 is valid. We then appeal to part (c) of Theorem 4.3, which gives an explicit formula for $s$, and to part
(d) of Theorem 4.2, which gives an explicit formula for the elements of $\mathcal{E}^u$. The derivative $s'$, graphed in
the top left plot in Figure 2, is given by



\[
s'(u) = \sqrt{\frac{q-1}{-2qu-1}} \left[ \log\left(1 + \sqrt{(q-1)(-2qu-1)}\right) - \log\left(1 - \sqrt{\frac{-2qu-1}{q-1}}\right) \right]. \tag{5.1}
\]
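As a numerical sanity check on the formula for $s'$, the following Python sketch evaluates it and compares its value at $u_0$ with $\beta_c$, as the thermodynamic relation $\beta = s'(u)$ suggests it should. The function names are illustrative only; the expressions for $\beta_c$, $u_0$, and $s'(u)$ are those quoted in the text, and the code assumes $-\frac{1}{2} < u < -\frac{1}{2q}$.

```python
import math

def beta_c(q):
    """Critical inverse temperature 2(q-1)/(q-2) * log(q-1)."""
    return 2.0 * (q - 1) / (q - 2) * math.log(q - 1)

def u0(q):
    """Mean energy u_0 = (-q^2 + 3q - 3) / (2q(q-1)) from (6.2)."""
    return (-q * q + 3 * q - 3) / (2.0 * q * (q - 1))

def s_prime(u, q):
    """Derivative of the microcanonical entropy, assuming -1/2 < u < -1/(2q)."""
    t = -2.0 * q * u - 1.0        # lies in (0, q-1) on this range
    w = math.sqrt(t / (q - 1))    # lies in (0, 1); note (q-1)*w = sqrt((q-1)*t)
    return (math.log(1.0 + (q - 1) * w) - math.log(1.0 - w)) / w

if __name__ == "__main__":
    q = 8
    # The two printed values should agree up to rounding error.
    print(s_prime(u0(q), q), beta_c(q))
```

Writing the prefactor as $1/w$ with $w = \sqrt{(-2qu-1)/(q-1)}$ is only a convenience; it is algebraically the same as the square-root prefactor in the displayed formula.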



The canonical phase diagram, given in the top right plot in Figure 2, summarizes the description of $\mathcal{E}_\beta$
given in Theorem 3.1 and shows the discontinuous phase transition exhibited by this ensemble at $\beta_c =
\frac{2(q-1)}{q-2}\log(q-1) = \frac{7}{3}\log 7$. The solid line in this plot for $\beta < \beta_c$ represents the common value $\frac{1}{8}$ of each
of the components of $\rho$, which is the unique phase for $\beta < \beta_c$. For $\beta > \beta_c$ there are eight phases given
by $\nu^1(\beta)$ together with the vectors $\nu^i(\beta)$ obtained by interchanging the first and $i$th components of $\nu^1(\beta)$.
Finally, for $\beta = \beta_c$ there are nine phases consisting of $\rho$ and the vectors $\nu^i(\beta_c)$ for $i = 1, 2, \ldots, 8$. The
solid and dashed curves in the top right plot in Figure 2 show the first component and the last seven, equal
components of $\nu^1(\beta)$ for $\beta \in [\beta_c, \infty)$. The first component is a strictly increasing function equal to $\frac{7}{8}$
for $\beta = \beta_c$ and increasing to 1 as $\beta \to \infty$ while the last seven, equal components are strictly decreasing
functions equal to $\frac{1}{56}$ for $\beta = \beta_c$ and decreasing to 0 as $\beta \to \infty$.
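The component values at $\beta_c$ can be checked with exact rational arithmetic. The sketch below assumes that $\nu^1(\beta)$ has the form $\left(\frac{1+(q-1)w}{q}, \frac{1-w}{q}, \ldots, \frac{1-w}{q}\right)$ with $w = \frac{q-2}{q-1}$ at $\beta = \beta_c$; this form is the one used in the proof of Lemma 6.1, while the value of $w$ at $\beta_c$ is an assumption made for this illustration. The helper names are hypothetical.

```python
from fractions import Fraction

def nu1(q, w):
    """Vector ((1+(q-1)w)/q, (1-w)/q, ..., (1-w)/q); assumed form of nu^1."""
    first = (1 + (q - 1) * w) / q
    rest = (1 - w) / q
    return [first] + [rest] * (q - 1)

def mean_energy(nu):
    """H(nu) = -<nu, nu> / 2."""
    return -sum(x * x for x in nu) / 2

def u0(q):
    """u_0 = (-q^2 + 3q - 3) / (2q(q-1)) from (6.2), as an exact fraction."""
    return Fraction(-q * q + 3 * q - 3, 2 * q * (q - 1))

if __name__ == "__main__":
    q = 8
    w_c = Fraction(q - 2, q - 1)   # assumed value of w at beta_c
    nu_c = nu1(q, w_c)
    # First component, one of the equal components, H(nu_c), and u_0.
    print(nu_c[0], nu_c[1], mean_energy(nu_c), u0(q))
```

Under these assumptions the mean energy of the vector at $\beta_c$ coincides with $u_0$, which is the relationship stated in part (a) of Lemma 6.1.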

The microcanonical phase diagram, given in the bottom left plot in Figure 2, summarizes the description
of $\mathcal{E}^u$ given in Theorem 4.2 and shows the continuous phase transition exhibited by this ensemble as $u$
decreases from the maximum value $u_c = -\frac{1}{2q} = -\frac{1}{16}$. The single phase $\rho$ for $u = -\frac{1}{16}$ is represented
by the point lying over this value of $u$. For $u \in [-\frac{1}{2}, -\frac{1}{16})$ there are eight phases given by $\nu^1(u)$ together
with the vectors $\nu^i(u)$ obtained by interchanging the first and $i$th components of $\nu^1(u)$. The solid and
dashed curves in the bottom left plot in Figure 2 show the first component $a(u)$ and the last seven, equal
components $b(u)$ of $\nu^1(u)$ for $u \in [-\frac{1}{2}, -\frac{1}{16})$. The first component is a strictly increasing function of $-u$
equal to $\frac{1}{8}$ for $u = -\frac{1}{16}$ and increasing to 1 as $u \to -\frac{1}{2}$, while the last seven, equal components are strictly
decreasing functions of $-u$ equal to $\frac{1}{8}$ for $u = -\frac{1}{16}$ and decreasing to 0 as $u \to -\frac{1}{2}$.

The different nature of the two phase transitions — discontinuous in the canonical ensemble versus 
continuous in the microcanonical ensemble — implies that the two ensembles are not fully equivalent for all 
values of $u$. By necessity, the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates must omit a set of microcanonical
equilibrium macrostates. Further details concerning ensemble equivalence and nonequivalence can be seen
by examining the graph of $s'$, given in the top left plot of Figure 2. This graph, which is the bridge between
the canonical and microcanonical phase diagrams, shows that $s'$ is strictly decreasing on the interval $\operatorname{int} F =
(-\frac{1}{2}, u_0)$, which is the interior of the set $F$ of full ensemble equivalence. The critical value $\beta_c$ equals the
slope of the affine portion of the graph of $s^{**}$ over the interval $N = (u_0, -\frac{1}{2q})$ of ensemble nonequivalence.
This affine portion is represented in the top left plot of Figure 2 by the horizontal dashed line at $\beta_c$.

Figure 2 exhibits the full equivalence of ensembles that holds for $u \in \operatorname{int} F = (-\frac{1}{2}, u_0)$ [Thm. 6.2(a)].
For $u$ in this interval the solid and dashed curves representing the components of $\nu^1(u) \in \mathcal{E}^u$ can be put
in one-to-one correspondence with the solid and dashed curves representing the same two components of
$\nu^1(\beta) \in \mathcal{E}_\beta$ for $\beta \in (\beta_c, \infty)$. The values of $u$ and $\beta$ are related by $s'(u) = \beta$. Full equivalence of
ensembles also holds for $u = -\frac{1}{2q} \in F$, the right-hand endpoint of the interval on which $s$ is finite. The
solid vertical line in the top right plot for $\beta < \beta_c$, which represents the unique phase $\rho$, is collapsed to
the single point representing the unique phase $\rho$ for $u = -\frac{1}{2q}$ in the bottom left plot. This collapse shows
that the canonical notion of temperature is somewhat ill-defined at $u = -\frac{1}{2q}$ since varying $\beta$ in the range $\beta < \beta_c$
changes neither the equilibrium macrostate $\rho$ nor the associated mean energy $u$. This feature of the
Curie-Weiss-Potts model is not present, for example, in the mean-field Blume-Emery-Griffiths spin model, which
also exhibits nonequivalence of ensembles [18].

By comparing the top right and bottom left plots, we see that the elements of $\mathcal{E}^u$ cease to be related to
those of $\mathcal{E}_\beta$ for $u \in N = (u_0, -\frac{1}{2q})$, which is the interval on which $s$ is not concave. For any mean-energy
value $u$ in this interval no $\nu \in \mathcal{E}_\beta$ exists that can be put in correspondence with an equivalent equilibrium
empirical vector contained in $\mathcal{E}^u$. Thus, although the equilibrium macrostates corresponding to $u \in N$ are
characterized by a well defined value of the mean energy, it is impossible to assign an inverse temperature $\beta$
to those macrostates from the viewpoint of the canonical ensemble. In other words, the canonical ensemble
is blind to all mean-energy values $u$ contained in the interval $N$ of nonconcavity of $s$. This is closely related
to the presence of the discontinuous phase transition seen in the canonical ensemble.

The quantity $u_0$ defined in (6.2) plays a central role in the analysis of phase transitions and ensemble
equivalence in the Curie-Weiss-Potts model. First, as we saw in our discussion of Figure 1, $u_0$ separates the
interval $(-\frac{1}{2}, u_0)$ of full ensemble equivalence from the interval $(u_0, -\frac{1}{2q})$ of nonequivalence. Second, part
(a) of Lemma 6.1 shows that $u_0$ equals the limiting mean energy $H(\nu^1(\beta_c))$ in the canonical equilibrium
macrostate as $\beta \to (\beta_c)^+$. In Figures 3 and 4 we present for $q = 3$ a third, geometric interpretation
FIG. 3: Graphical representation of the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates for $q = 3$ showing the maximal
circle of intersection corresponding to $u = u_0$; the vector $\rho$; the unit-coordinate vectors $A$, $B$, and $C$; and the
macrostates $A_c = \nu^1(\beta_c)$, $B_c = \nu^2(\beta_c)$, and $C_c = \nu^3(\beta_c)$. The line segments $A_cA$, $B_cB$, and $C_cC$ represent the
elements of $\mathcal{E}_\beta$ for $\beta > \beta_c$.



of $u_0$ that is also related to nonequivalence of ensembles.

Before explaining this third, geometric interpretation of $u_0$, we recall that according to part (a) of
Theorem 4.2, $\mathcal{E}^u$ is nonempty, or equivalently the constraint set in (4.5) is nonempty, if and only if
$u \in [-\frac{1}{2}, -\frac{1}{2q}] = [-\frac{1}{2}, -\frac{1}{6}]$. Geometrically, the energy constraint $H(\nu) = -\frac{1}{2}\langle \nu, \nu \rangle = u$ corresponds
to the sphere in $\mathbb{R}^3$ with center 0 and radius $\sqrt{-2u}$. This sphere intersects the set $\mathcal{P}$ of probability vectors
if and only if $u \in [-\frac{1}{2}, -\frac{1}{6}]$. For $u = -\frac{1}{6}$, the sphere is tangent to $\mathcal{P}$ at the unique point $\rho$ while for
$u = -\frac{1}{2}$, the sphere intersects $\mathcal{P}$ at the $q = 3$ unit-coordinate vectors. The intersection of the sphere and
$\mathcal{P}$ undergoes a phase transition at $u_0$ in the following sense. For $u \in [u_0, -\frac{1}{6})$ the sphere intersects $\mathcal{P}$ in a
circle while for $u \in [-\frac{1}{2}, u_0)$, the sphere intersects $\mathcal{P}$ in a proper subset of a circle; the complement of this
subset lies outside the nonnegative octant of $\mathbb{R}^3$. For $u = u_0 = -\frac{1}{4}$, the circle of intersection is maximal
and is tangent to the boundary of $\mathcal{P}$.
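The geometric statements for $q = 3$ can be verified with a few lines of exact arithmetic. The sketch below uses the fact that the sphere of radius $\sqrt{-2u}$ centered at the origin meets the plane $\nu_1 + \nu_2 + \nu_3 = 1$ in a circle centered at $\rho$ whose squared radius is $-2u - \|\rho\|^2$ (Pythagoras). It then finds the energy at which this circle touches an edge of the simplex. All names are illustrative and not taken from the paper.

```python
from fractions import Fraction

Q = 3
RHO = [Fraction(1, Q)] * Q                       # center of the simplex
ORIGIN = [Fraction(0)] * Q
VERTEX = [Fraction(1), Fraction(0), Fraction(0)]
EDGE_MIDPOINT = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]

def dist_sq(x, y):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def circle_radius_sq(u):
    """Squared radius of the circle {<nu,nu> = -2u} within {sum nu_i = 1}."""
    return -2 * u - dist_sq(RHO, ORIGIN)

def energy_for_radius_sq(r_sq):
    """Inverse of circle_radius_sq: the energy u giving squared radius r_sq."""
    return -(r_sq + dist_sq(RHO, ORIGIN)) / 2

if __name__ == "__main__":
    # Energy at which the circle is tangent to the edges of the simplex.
    print(energy_for_radius_sq(dist_sq(RHO, EDGE_MIDPOINT)))
    # Energy at which the circle passes through the vertices.
    print(energy_for_radius_sq(dist_sq(RHO, VERTEX)))
```

The tangency energy computed this way agrees with $u_0 = -\frac{1}{4}$ for $q = 3$, and the circle degenerates to the point $\rho$ at $u = -\frac{1}{6}$.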

The set $\mathcal{E}_\beta$ of canonical equilibrium macrostates for $q = 3$ is represented in Figure 3. In this figure the
maximal circle of intersection corresponding to $u = u_0 = -\frac{1}{4}$ is shown together with the vector $\rho$ at its
center; the points $A$, $B$, and $C$ representing the respective unit-coordinate vectors $(1,0,0)$, $(0,1,0)$, and
$(0,0,1)$; and the points $A_c$, $B_c$, and $C_c$ representing the respective equilibrium macrostates $\nu^1(\beta_c)$, $\nu^2(\beta_c)$,
and $\nu^3(\beta_c)$. These three macrostates lie on the maximal circle of intersection since $H(\nu^1(\beta_c)) = u_0$ [Lem.
6.1(a)]. For $\beta > \beta_c$ all $\nu \in \mathcal{E}_\beta$ have two equal components, and as $\beta \to \infty$ these vectors converge to the
unit-coordinate vectors $A$, $B$, and $C$. Hence for $\beta > \beta_c$ the equilibrium macrostates $\nu^1(\beta)$, $\nu^2(\beta)$, and
$\nu^3(\beta)$ are represented by the open line segments $A_cA$, $B_cB$, and $C_cC$.

The set $\mathcal{E}^u$ of microcanonical equilibrium macrostates for $q = 3$ is represented in Figure 4. In this figure
the maximal circle of intersection corresponding to $u = u_0 = -\frac{1}{4}$ is shown together with the vector $\rho$
at its center; the points $A$, $B$, and $C$ representing the unit-coordinate vectors; and the points $A_0$, $B_0$, and
$C_0$ representing the respective equilibrium macrostates $\nu^1(u_0)$, $\nu^2(u_0)$, and $\nu^3(u_0)$. For $u \in (-\frac{1}{2}, -\frac{1}{6})$
all $\nu \in \mathcal{E}^u$ have two equal components, and as $u \to -\frac{1}{2}$ they converge to the unit-coordinate vectors $A$,
$B$, and $C$. Hence for $u \in (-\frac{1}{2}, -\frac{1}{6})$ the equilibrium macrostates $\nu^1(u)$, $\nu^2(u)$, and $\nu^3(u)$ are represented
by the open line segments $\rho A$, $\rho B$, and $\rho C$. As we saw in the preceding section, for each $u \in (-\frac{1}{2}, -\frac{1}{6})$
the macrostates $\nu^1(u)$, $\nu^2(u)$, and $\nu^3(u)$ lie on the intersection of the sphere of radius $\sqrt{-2u}$ with $\mathcal{P}$. In
particular, $A_0 = \nu^1(u_0)$, $B_0 = \nu^2(u_0)$, and $C_0 = \nu^3(u_0)$ lie on the maximal circle of intersection.

The distinguishing feature of Figure 4 is the three open dashed-line segments $\rho A_0$, $\rho B_0$, and $\rho C_0$
FIG. 4: Graphical representation of the set $\mathcal{E}^u$ of microcanonical equilibrium macrostates for $q = 3$ showing the
maximal circle of intersection corresponding to $u = u_0$; the vector $\rho$; the unit-coordinate vectors $A$, $B$, and $C$; and
the macrostates $A_0 = \nu^1(u_0)$, $B_0 = \nu^2(u_0)$, and $C_0 = \nu^3(u_0)$. The solid-line segments $A_0A$, $B_0B$, and $C_0C$
represent the elements of $\mathcal{E}^u$ that are realized canonically. The dashed-line segments $\rho A_0$, $\rho B_0$, and $\rho C_0$ represent
the elements of $\mathcal{E}^u$ that are not realized canonically.



representing the elements of $\mathcal{E}^u$ that are not realized canonically; namely, $\nu^1(u)$, $\nu^2(u)$, and $\nu^3(u)$ for
$u \in (u_0, -\frac{1}{6})$. The three half-open solid-line segments $A_0A$, $B_0B$, and $C_0C$ represent the elements of $\mathcal{E}^u$
that are realized canonically; namely, $\nu^1(u)$, $\nu^2(u)$, and $\nu^3(u)$ for $u \in (-\frac{1}{2}, u_0]$. For each such $u$ the value
of $\beta$ for which $\mathcal{E}^u = \mathcal{E}_\beta$ is determined by the equation $H(\nu^1(\beta)) = u$ [Thm. 6.2(a)]. Thus in Figure 3 the
corresponding elements of $\mathcal{E}_\beta$ lie on the intersection of the sphere of radius $\sqrt{-2u}$ and $\mathcal{P}$.

This completes our discussion of equivalence and nonequivalence of ensembles. In the next section we 
will prove a number of statements concerning ensemble equivalence and nonequivalence that have been 
determined graphically. 



VI. PROOFS OF EQUIVALENCE AND NONEQUIVALENCE OF ENSEMBLES 

Using the general results of [15], we stated in the preceding section the equivalence and nonequivalence
relationships that exist between $\mathcal{E}^u$ and $\mathcal{E}_\beta$ and verified these relationships using the plots of these sets for
$q = 8$ given in Figure 2. Our purpose in the present section is to prove these relationships using mapping
properties of the mean energy function $u(\beta)$ defined for $\beta \neq \beta_c$ by

\[
u(\beta) = \begin{cases} H(\rho) = -\frac{1}{2q} & \text{for } \beta < \beta_c, \\ H(\nu^1(\beta)) & \text{for } \beta > \beta_c. \end{cases} \tag{6.1}
\]

Here $\nu^1(\beta)$ is the unique canonical equilibrium macrostate modulo permutations for $\beta > \beta_c$ [Thm. 3.1].

According to the next lemma, for $\beta > \beta_c$, $u(\beta)$ is continuous and strictly decreasing and $u(\beta) < -\frac{1}{2q}$,
which equals the mean energy for $\beta < \beta_c$. It follows that as $\beta$ increases through $\beta_c$, $u(\beta)$ is discontinuous,
jumping down from $-\frac{1}{2q}$ to $H(\nu^1(\beta_c))$. This discontinuity in $u(\beta)$ mirrors in a natural way the discontinuity
in $\mathcal{E}_\beta$ as $\beta$ increases through $\beta_c$.

Lemma 6.1. For $\beta \in [\beta_c, \infty)$ we define $\nu^1(\beta)$ as in part (b) of Theorem 3.1, and we define

\[ u_0 = \frac{-q^2 + 3q - 3}{2q(q-1)}. \tag{6.2} \]

The following conclusions hold.

(a) $-\frac{1}{2} < u_0 < -\frac{1}{2q}$ and $\lim_{\beta \to (\beta_c)^+} u(\beta) = H(\nu^1(\beta_c)) = u_0$.

(b) The function mapping

\[ \beta \in (\beta_c, \infty) \mapsto u(\beta) = H(\nu^1(\beta)) = -\tfrac{1}{2}\langle \nu^1(\beta), \nu^1(\beta) \rangle \]

is a strictly decreasing, differentiable bijection onto the interval $(-\frac{1}{2}, u_0)$.

Proof. (a) The inequalities involving $u_0$ follow immediately from the inequality $q \geq 3$. The relationship
$H(\nu^1(\beta_c)) = u_0$ is easily determined using the explicit form of $\nu^1(\beta_c)$ given in (3.5). That
$\lim_{\beta \to (\beta_c)^+} u(\beta) = H(\nu^1(\beta_c))$ follows from the definition of $u(\beta)$ and the continuity of $\nu^1(\beta)$ for $\beta > \beta_c$.
(b) For $\beta \in (\beta_c, \infty)$ we use the formula for $\nu^1(\beta)$ given in part (b) of Theorem 3.1 to calculate

\[ u(\beta) = -\frac{1}{2}\left( \frac{[1 + (q-1)w(\beta)]^2}{q^2} + (q-1)\frac{[1 - w(\beta)]^2}{q^2} \right). \]

Since $w(\beta)$ is positive, strictly increasing, and differentiable for $\beta \in (\beta_c, \infty)$ [Thm. 3.1(a)] and since

\[ u'(\beta) = -\frac{q-1}{q}\, w(\beta)\, w'(\beta) < 0 \quad \text{for } \beta \in (\beta_c, \infty), \]

$u(\beta)$ is strictly decreasing for $\beta \in (\beta_c, \infty)$. In addition, since $\lim_{\beta \to \infty} w(\beta) = 1$ [Thm. 3.1(a)], we have
$\lim_{\beta \to \infty} u(\beta) = -\frac{1}{2}$, and by part (a) of this lemma

\[ \lim_{\beta \to (\beta_c)^+} u(\beta) = H(\nu^1(\beta_c)) = u_0. \]

It follows that the function mapping $\beta \in (\beta_c, \infty) \mapsto u(\beta)$ is a strictly decreasing, differentiable bijection
onto the interval $(-\frac{1}{2}, u_0)$. This completes the proof of part (b). ■

Mapping properties of $u(\beta)$ play an important role in the next theorem, in which we prove that the sets
$F$, $P$, and $N$ defined in (6.3) correspond to full equivalence, partial equivalence, and nonequivalence of
ensembles. For $u \in F$ we consider three subcases in order to indicate the value of $\beta$ for which $\mathcal{E}^u = \mathcal{E}_\beta$;
for $u \in \operatorname{int} F = (-\frac{1}{2}, u_0)$, $\beta$ and $u$ are related by $\beta = s'(u)$ and $u = u(\beta)$. Part (c) shows an interesting
degeneracy in the equivalence-of-ensemble picture, the set $\mathcal{E}^u$ for $u = -\frac{1}{2q}$ corresponding to all $\mathcal{E}_\beta$ for
$\beta < \beta_c$. This is related to the fact that for all such values of $\beta$, $\mathcal{E}_\beta = \{\rho\}$ and thus the mean energy $u(\beta)$
equals $-\frac{1}{2q}$.

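Theorem 6.2. We define $s(u)$ in (4.2), $u(\beta)$ in (6.1), $\mathcal{E}_\beta$ in (2.3), and $\mathcal{E}^u$ in (2.4). We also define $\beta_c$ in (3.1)
and $u_0$ in (6.2). The sets

\[ F = (-\tfrac{1}{2}, u_0) \cup \{-\tfrac{1}{2q}\}, \quad P = \{u_0\}, \quad \text{and} \quad N = (u_0, -\tfrac{1}{2q}) \tag{6.3} \]

have the following properties.

(a) Full equivalence on $\operatorname{int} F$. For $u \in \operatorname{int} F = (-\frac{1}{2}, u_0)$, there exists a unique $\beta \in (\beta_c, \infty)$ such that
$\mathcal{E}^u = \mathcal{E}_\beta$; $\beta$ satisfies $u(\beta) = H(\nu^1(\beta)) = u$.

(b) For $u \in \operatorname{int} F = (-\frac{1}{2}, u_0)$, $s$ is differentiable. The values $u$ and $\beta$ for which $\mathcal{E}^u = \mathcal{E}_\beta$ in part (a) are
also related by the thermodynamic formula $s'(u) = \beta$.

(c) Full equivalence at $-\frac{1}{2q}$. For $u = -\frac{1}{2q} \in F$, $\mathcal{E}^{-1/(2q)} = \mathcal{E}_\beta$ for any $\beta < \beta_c$.

(d) Partial equivalence on $P$. For $u \in P = \{u_0\}$, $\mathcal{E}^{u_0} \subset \mathcal{E}_{\beta_c}$ but $\mathcal{E}^{u_0} \neq \mathcal{E}_{\beta_c}$. In fact, $\mathcal{E}_{\beta_c} = \mathcal{E}^{u_0} \cup \mathcal{E}^{-1/(2q)}$.

(e) Nonequivalence on $N$. For any $u \in N = (u_0, -\frac{1}{2q})$, $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}$.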
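The partition of the energy range into $F$, $P$, and $N$ depends only on $q$ through $u_0$. A minimal Python sketch, with a hypothetical function `classify` that is not part of the paper, makes the case distinction explicit using exact fractions so that the boundary values $u_0$ and $-\frac{1}{2q}$ are compared without rounding.

```python
from fractions import Fraction

def u0(q):
    """u_0 = (-q^2 + 3q - 3) / (2q(q-1)) from (6.2), as an exact fraction."""
    return Fraction(-q * q + 3 * q - 3, 2 * q * (q - 1))

def classify(u, q):
    """Return which of the sets F, P, N of (6.3) contains u, or None."""
    u = Fraction(u)
    left, right = Fraction(-1, 2), Fraction(-1, 2 * q)
    if left < u < u0(q) or u == right:
        return "full"            # u in F
    if u == u0(q):
        return "partial"         # u in P
    if u0(q) < u < right:
        return "nonequivalent"   # u in N
    return None                  # u = -1/2 or u outside dom s

if __name__ == "__main__":
    for u in (Fraction(-3, 10), Fraction(-1, 4), Fraction(-1, 5), Fraction(-1, 6)):
        print(u, classify(u, 3))
```

The endpoint $u = -\frac{1}{2}$ is deliberately left unclassified here, in line with the remark that it is not included in $F$ even though equivalence can be extended to it through the limiting set $\mathcal{E}_\infty$.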
In reference to the properties of $s$ given in part (b), one can show that the function mapping $u \in
(-\frac{1}{2}, u_0) \mapsto s'(u)$ is a strictly decreasing, differentiable bijection onto the interval $(\beta_c, \infty)$ and that this
bijection is the inverse of the bijection mapping $\beta \in (\beta_c, \infty) \mapsto u(\beta)$.

Before we prove the theorem, it is instructive to compare its assertions with those in Theorem 5.1, which
formulates ensemble equivalence and nonequivalence in terms of support properties of $s$. These support
properties can be seen in the schematic plot of the graph of $s$ in Figure 1. We start with part (a) of
Theorem 6.2, which states that for any $u \in \operatorname{int} F = (-\frac{1}{2}, u_0)$ there exists a unique $\beta \in (\beta_c, \infty)$ such that
$\mathcal{E}^u = \mathcal{E}_\beta$. As promised in part (a)(i) of Theorem 5.1, this $\beta$ is the slope of a strictly supporting line to the
graph of $s$ at $u$. The situation that holds when $u = -\frac{1}{2q}$ [Thm. 6.2(c)] is also consistent with part (a)(i) of
Theorem 5.1. For this value of $u$, which is the isolated point of the set $F$ of full equivalence, there exist
infinitely many strictly supporting lines to the graph of $s$, the possible slopes of which are all $\beta \in (-\infty, \beta_c)$.
On the other hand, when $u = u_0$, which is the only value lying in the set $P$ of partial equivalence, we have
$\mathcal{E}^{u_0} \subset \mathcal{E}_{\beta_c}$ but $\mathcal{E}^{u_0} \neq \mathcal{E}_{\beta_c}$ [Thm. 6.2(d)]. In combination with part (a)(ii) of Theorem 5.1 it follows that
there exists a nonstrictly supporting line at $u_0$ with slope $\beta_c$. Finally, for $u \in N = (u_0, -\frac{1}{2q})$, we have
$\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}$ [Thm. 6.2(e)]. In accordance with part (a)(iii) of Theorem 5.1, $s$ has no
supporting line at any $u \in N$, and by Theorem A.4 in [10] $s$ is not concave at any $u \in N$.

Proof of Theorem 6.2. (a) For $\beta > \beta_c$ part (b) of Theorem 3.1 and part (b) of Theorem 5.1 imply that

\[ \mathcal{E}_\beta = \{\nu^1(\beta), \ldots, \nu^q(\beta)\} = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u. \]

The symmetry of $H$ with respect to permutations implies that $H(\mathcal{E}_\beta) = \{H(\nu^1(\beta))\}$. Thus for any $\beta > \beta_c$

\[ \mathcal{E}_\beta = \mathcal{E}^{H(\nu^1(\beta))}. \tag{6.4} \]

Since for any $u \in \operatorname{int} F = (-\frac{1}{2}, u_0)$ there exists a unique $\beta \in (\beta_c, \infty)$ satisfying $u(\beta) = H(\nu^1(\beta)) = u$
[Lem. 6.1(b)], it follows that $\mathcal{E}^u = \mathcal{E}_\beta$.

(b) According to part (b) of Theorem 6.3, $s$ is differentiable at all $u \in \operatorname{int} F$. Since $s = s^{**}$ in a
neighborhood of each such $u$, part (a) of Theorem A.3 in [10] implies that $s'(u) = \beta$.

(c) By (US and part (b) of Theorem O 

\[ \mathcal{E}^{-1/(2q)} = \{\rho\} = \mathcal{E}_\beta \quad \text{for any } \beta < \beta_c. \tag{6.5} \]

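(d) By part (b) of Theorem 3.1, symmetry, and part (a) of Lemma 6.1,

\[ H(\mathcal{E}_{\beta_c}) = \{H(\rho), H(\nu^1(\beta_c))\} = \{-\tfrac{1}{2q}, u_0\}. \]

Hence by (6.4) and (6.5)

\[ \mathcal{E}_{\beta_c} = \bigcup_{u \in H(\mathcal{E}_{\beta_c})} \mathcal{E}^u = \mathcal{E}^{-1/(2q)} \cup \mathcal{E}^{u_0} = \{\rho\} \cup \mathcal{E}^{u_0}. \]

However, $\rho \notin \mathcal{E}^{u_0}$ since $\rho$ does not satisfy the constraint $H(\rho) = u_0$. It follows that $\mathcal{E}^{u_0} \subset \mathcal{E}_{\beta_c}$ but that
$\mathcal{E}^{u_0} \neq \mathcal{E}_{\beta_c}$.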

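(e) If $u \in N$, then $u \notin (-\frac{1}{2}, u_0)$, and so by part (b) of Lemma 6.1, $u \neq H(\nu^1(\beta))$ for any $\beta \in (\beta_c, \infty)$.
Since by (6.4) $\mathcal{E}_\beta = \mathcal{E}^{H(\nu^1(\beta))}$ for all $\beta > \beta_c$, it follows that for all $\beta > \beta_c$

\[ \mathcal{E}^u \cap \mathcal{E}^{H(\nu^1(\beta))} = \emptyset \]

and thus that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$. For any $\beta < \beta_c$, (6.5) states that $\mathcal{E}_\beta = \mathcal{E}^{-1/(2q)} = \{\rho\}$. Since $u \in N$, we have
$u \neq -\frac{1}{2q}$ and thus $\mathcal{E}^{-1/(2q)} \cap \mathcal{E}^u = \emptyset$. It follows that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for any $\beta < \beta_c$. Finally, for $\beta = \beta_c$
part (b) of Theorem 3.1 states that $\mathcal{E}_{\beta_c} = \{\rho, \nu^1(\beta_c), \ldots, \nu^q(\beta_c)\}$. However, since $H(\rho) = -\frac{1}{2q} \notin N$ and
$H(\nu^i(\beta_c)) = u_0 \notin N$, none of the vectors in $\mathcal{E}_{\beta_c}$ satisfies the constraint $H(\nu) = u$. Thus $\mathcal{E}^u \cap \mathcal{E}_{\beta_c} = \emptyset$.
We have proved $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}$. The proof of the theorem is complete. ■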

We end this section by showing that for arbitrary $q \geq 4$ and $u$ in the equivalence sets $F \cup P = (-\frac{1}{2}, u_0] \cup
\{-\frac{1}{2q}\}$ the formulas for $\mathcal{E}^u$ and $s(u)$ given in part (d) of Theorem 4.2 and part (c) of Theorem 4.3 are
rigorously true. Our strategy is to use the equivalence of the microcanonical and canonical ensembles for
$u \in F \cup P$ and the fact that the form of $\mathcal{E}_\beta$ is known exactly for all $\beta$. Thus, we translate the form of $\nu \in \mathcal{E}_\beta$,
as given in part (b) of Theorem 3.1, into the form of $\nu \in \mathcal{E}^u$ for $u \in F \cup P$. For $\beta \in [\beta_c, \infty)$, the last $q-1$
components of $\nu^1(\beta) \in \mathcal{E}_\beta$ are given by

\[ \nu^1_j(\beta) = \frac{1 - w(\beta)}{q}, \tag{6.6} \]

and these components are not equal to the first component. Since for each $u \in F \cup P$ there exists $\beta \in [\beta_c, \infty)$
such that either $\mathcal{E}^u = \mathcal{E}_\beta$ or $\mathcal{E}^u \subset \mathcal{E}_\beta$, it follows that modulo permutations all $\nu \in \mathcal{E}^u$ have their last $q-1$
components equal to each other. That is, modulo permutations there exist numbers $a$ and $b$ in $[0, 1]$ such
that $\nu = (a, b, \ldots, b)$. The possible values of $a$ and $b$ are easily determined by considering the constraints
satisfied by $\nu \in \mathcal{E}^u$. These constraints are

\[ a + (q-1)b = 1 \quad \text{and} \quad a^2 + (q-1)b^2 = -2u. \]

The two solutions of these equations are

\[ a_1 = \frac{1 - \sqrt{(q-1)(-2qu-1)}}{q}, \quad b_1 = \frac{q - 1 + \sqrt{(q-1)(-2qu-1)}}{q(q-1)} \]

and

\[ a_2 = \frac{1 + \sqrt{(q-1)(-2qu-1)}}{q}, \quad b_2 = \frac{q - 1 - \sqrt{(q-1)(-2qu-1)}}{q(q-1)}. \]

Of the two values $b_1$ and $b_2$, only $b_2$ has the form given in (6.6) with

\[ w(\beta) = \frac{\sqrt{(q-1)(-2qu-1)}}{q-1} \in (0, 1]. \]

We conclude that modulo permutations each $\nu \in \mathcal{E}^u$ has the form $(a_2, b_2, \ldots, b_2)$, in which the last $q-1$
components all equal $b_2$. This coincides with the formula for $\nu^1(u)$ given in part (d) of Theorem 4.2, which
in turn gives the explicit formula for $s(u)$ in part (c) of Theorem 4.3. This information is summarized in
part (a) of the next theorem. The differentiability of $s$ on $\operatorname{int} F$, which is stated in part (b), is an immediate
consequence of the explicit formula for $s(u)$.
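The algebra behind the two solutions is easy to confirm numerically. The following sketch, with illustrative function names, evaluates both pairs $(a_1, b_1)$ and $(a_2, b_2)$ and reports how far each is from satisfying the normalization and energy constraints; it also recovers $w$ from $b_2$ through the relation $b_2 = (1 - w)/q$. It assumes $-\frac{1}{2} \leq u \leq -\frac{1}{2q}$ so that the square root is real.

```python
import math

def constraint_solutions(u, q):
    """Both solutions (a, b) of a + (q-1)b = 1 and a^2 + (q-1)b^2 = -2u."""
    root = math.sqrt((q - 1) * (-2.0 * q * u - 1.0))
    sol1 = ((1.0 - root) / q, (q - 1 + root) / (q * (q - 1)))
    sol2 = ((1.0 + root) / q, (q - 1 - root) / (q * (q - 1)))
    return sol1, sol2

def residuals(a, b, u, q):
    """Deviation of (a, b) from the two constraints; both should be near 0."""
    return (a + (q - 1) * b - 1.0, a * a + (q - 1) * b * b + 2.0 * u)

if __name__ == "__main__":
    q, u = 8, -0.4
    for a, b in constraint_solutions(u, q):
        print((a, b), residuals(a, b, u, q))
    a2, b2 = constraint_solutions(u, q)[1]
    # w recovered from b2 = (1 - w)/q, compared with sqrt((-2qu-1)/(q-1)).
    print(1.0 - q * b2, math.sqrt((-2.0 * q * u - 1.0) / (q - 1)))
```

For energies with $(q-1)(-2qu-1) > 1$ the first solution has $a_1 < 0$ and so is not a probability vector at all, which gives a second, elementary reason to discard it in that range.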



Theorem 6.3. We define $u_0$ in (6.2). The following conclusions hold.

(a) For arbitrary $q \geq 4$ and $u$ in the equivalence sets $F \cup P = (-\frac{1}{2}, u_0] \cup \{-\frac{1}{2q}\}$ the formulas for $\mathcal{E}^u$
and $s(u)$ given in part (d) of Theorem 4.2 and part (c) of Theorem 4.3 are rigorously true.

(b) For arbitrary $q \geq 4$, $s$ is differentiable on the interval $\operatorname{int} F = (-\frac{1}{2}, u_0)$ and $s'(u)$ is given by (5.1).
APPENDIX A: TWO RELATED MAXIMIZATION PROBLEMS

Theorem A.1 is a new result on the maximum points of certain functions related by convex duality. It is
formulated for a finite, differentiable, convex function $F$ on $\mathbb{R}^\sigma$ and its Legendre-Fenchel transform

\[ F^*(z) = \sup_{x \in \mathbb{R}^\sigma} \{\langle x, z \rangle - F(x)\}. \]

With only minor changes in notation the theorem is also valid for a finite, Gateaux-differentiable, convex
function on a Hilbert space.

Theorem A.1 will be applied in Appendix B to prove that for $\beta > 0$, $\mathcal{E}_\beta$ has the form given in part (b)
of Theorem 3.1. Another application of Theorem A.1 is given in Proposition 3.4 in [17]. It is used there
to determine the form of the set of canonical equilibrium macrostates for another important spin system
known as the mean-field Blume-Emery-Griffiths model.

Theorem A.1. Let $\sigma$ be a positive integer and $F$ a finite, differentiable, convex function mapping $\mathbb{R}^\sigma$ into
$\mathbb{R}$. Assume that $\sup_{z \in \mathbb{R}^\sigma} \{F(z) - \frac{1}{2}\|z\|^2\} < \infty$ and that $F(z) - \frac{1}{2}\|z\|^2$ attains its supremum. The following
conclusions hold.

(a) \[ \sup_{z \in \mathbb{R}^\sigma} \{F(z) - \tfrac{1}{2}\|z\|^2\} = \sup_{z \in \operatorname{dom} F^*} \{\tfrac{1}{2}\|z\|^2 - F^*(z)\}. \]

(b) $\frac{1}{2}\|z\|^2 - F^*(z)$ attains its supremum on $\operatorname{dom} F^*$.

(c) The global maximum points of $F(z) - \frac{1}{2}\|z\|^2$ coincide with the global maximum points of $\frac{1}{2}\|z\|^2 -
F^*(z)$.

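Proof. We define the subdifferential of $F^*$ at $z_0 \in \mathbb{R}^\sigma$ by

\[ \partial F^*(z_0) = \{y \in \mathbb{R}^\sigma : F^*(z) \geq F^*(z_0) + \langle y, z - z_0 \rangle \text{ for all } z \in \mathbb{R}^\sigma\}. \]

We also define the domain of $\partial F^*$ to be the set of $z_0 \in \mathbb{R}^\sigma$ for which $\partial F^*(z_0) \neq \emptyset$. The proof of the
theorem uses three properties of Legendre-Fenchel transforms.

1. $F^*$ is a convex, lower semicontinuous function mapping $\mathbb{R}^\sigma$ into $\mathbb{R} \cup \{\infty\}$, and for all $z \in \mathbb{R}^\sigma$,
$F^{**}(z) = (F^*)^*(z)$ equals $F(z)$ [14, Thm. VI.5.3(a),(e)].

2. If for some $z_0 \in \mathbb{R}^\sigma$ and $z \in \mathbb{R}^\sigma$ we have $z = \nabla F(z_0)$, then $F(z_0) + F^*(z) = \langle z_0, z \rangle$, and so
$z \in \operatorname{dom} F^*$. In particular, if $z = z_0$, then $z_0 \in \operatorname{dom} F^*$ and $F(z_0) + F^*(z_0) = \|z_0\|^2$.

3. For $z_0 \in \operatorname{dom} F^*$ and $y \in \partial F^*(z_0)$ we have $F(y) + F^*(z_0) = \langle y, z_0 \rangle$ [14, Thm. VI.5.3(c),(d)]. In
particular, if $y = z_0$, then $F(z_0) + F^*(z_0) = \|z_0\|^2$.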

We first prove part (a), which is a special case of Theorem C.1 in Q. Let $M = \sup_{z \in \mathbb{R}^\sigma} \{F(z) -
\|z\|^2/2\}$. Since for any $z \in \operatorname{dom} F^*$ and $x$ in $\mathbb{R}^\sigma$

\[ F^*(z) + M \geq \langle x, z \rangle - F(x) + M \geq \langle x, z \rangle - \|x\|^2/2, \]

we have

\[ F^*(z) + M \geq \sup_{x \in \mathbb{R}^\sigma} \{\langle x, z \rangle - \|x\|^2/2\} = \|z\|^2/2. \]

It follows that $M \geq \|z\|^2/2 - F^*(z)$ and thus that $M \geq \sup_{z \in \operatorname{dom} F^*} \{\|z\|^2/2 - F^*(z)\}$. To prove the
reverse inequality, let $N = \sup_{z \in \operatorname{dom} F^*} \{\|z\|^2/2 - F^*(z)\}$. Then for any $z \in \mathbb{R}^\sigma$ and $x \in \operatorname{dom} F^*$

\[ \|z\|^2/2 + N \geq \langle x, z \rangle - \|x\|^2/2 + N \geq \langle x, z \rangle - F^*(x). \]

Since $F^*(x) = \infty$ for $x \notin \operatorname{dom} F^*$, it follows from property 1 that

\[ \|z\|^2/2 + N \geq \sup_{x \in \operatorname{dom} F^*} \{\langle x, z \rangle - F^*(x)\} = F(z) \]

and thus that $N \geq \sup_{z \in \mathbb{R}^\sigma} \{F(z) - \|z\|^2/2\}$.

In order to prove parts (b) and (c) of Theorem A.1, let $z_0$ be any point in $\mathbb{R}^\sigma$ at which $F(z) - \frac{1}{2}\|z\|^2$
attains its supremum. Then $z_0 = \nabla F(z_0)$, and so by the last line of property 2, $z_0 \in \operatorname{dom} F^*$ and $F(z_0) +
F^*(z_0) = \|z_0\|^2$. Part (a) now implies that

\[ \sup_{z \in \mathbb{R}^\sigma} \{F(z) - \tfrac{1}{2}\|z\|^2\} = F(z_0) - \tfrac{1}{2}\|z_0\|^2 = \tfrac{1}{2}\|z_0\|^2 - F^*(z_0) = \sup_{z \in \operatorname{dom} F^*} \{\tfrac{1}{2}\|z\|^2 - F^*(z)\}. \]

We conclude that $\frac{1}{2}\|z\|^2 - F^*(z)$ attains its supremum on $\operatorname{dom} F^*$ at $z_0$. Not only have we proved part
(b), but also we have proved half of part (c); namely, any global maximizer of $F(z) - \frac{1}{2}\|z\|^2$ is a global
maximizer of $\frac{1}{2}\|z\|^2 - F^*(z)$.

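Now let $z_0$ be any point at which $\frac{1}{2}\|z\|^2 - F^*(z)$ attains its supremum. Then for any $z \in \mathbb{R}^\sigma$

\[ \tfrac{1}{2}\langle z_0, z_0 \rangle - F^*(z_0) \geq \tfrac{1}{2}\langle z, z \rangle - F^*(z). \]

It follows that for any $z \in \mathbb{R}^\sigma$

\[ F^*(z) \geq F^*(z_0) + \tfrac{1}{2}(\langle z, z \rangle - \langle z_0, z_0 \rangle) \geq F^*(z_0) + \langle z_0, z - z_0 \rangle \]

and thus that $z_0 \in \partial F^*(z_0)$. By the last line of property 3 this implies that $F(z_0) + F^*(z_0) = \|z_0\|^2$. In
conjunction with part (a) this in turn implies that

\[ \sup_{z \in \operatorname{dom} F^*} \{\tfrac{1}{2}\|z\|^2 - F^*(z)\} = \tfrac{1}{2}\|z_0\|^2 - F^*(z_0) = F(z_0) - \tfrac{1}{2}\|z_0\|^2 = \sup_{z \in \mathbb{R}^\sigma} \{F(z) - \tfrac{1}{2}\|z\|^2\}. \]

We conclude that $F(z) - \frac{1}{2}\|z\|^2$ attains its supremum at $z_0$. This completes the proof of the theorem. ■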
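A one-dimensional example may help to make the duality concrete. The sketch below is only an illustration under stated assumptions: it takes $\sigma = 1$ and the function $F(z) = \frac{1}{\beta}\log\cosh(\beta z)$ with the arbitrary choice $\beta = 2$, whose Legendre-Fenchel transform is $F^*(y) = \frac{1}{2\beta}[(1+y)\log(1+y) + (1-y)\log(1-y)]$ on $[-1, 1]$. Both suprema are approximated by a grid search over $z \geq 0$, which loses nothing because both functions are even. None of the names below come from the paper.

```python
import math

BETA = 2.0     # illustrative parameter (an assumption, not from the text)
STEP = 1e-3    # grid spacing for the brute-force search

def F(z):
    """Finite, differentiable, convex function F(z) = log(cosh(BETA z)) / BETA."""
    return math.log(math.cosh(BETA * z)) / BETA

def xlogx(x):
    """x log x with the convention 0 log 0 = 0."""
    return x * math.log(x) if x > 0.0 else 0.0

def F_star(y):
    """Legendre-Fenchel transform of F; finite only for |y| <= 1."""
    return 0.5 * (xlogx(1.0 + y) + xlogx(1.0 - y)) / BETA

def grid_max(f, upper):
    """Maximize f over the grid {0, STEP, 2*STEP, ...} up to `upper`."""
    n = int(round(upper / STEP))
    best = max(range(n + 1), key=lambda k: f(k * STEP))
    return f(best * STEP), best * STEP

# sup of F(z) - z^2/2 and sup of y^2/2 - F*(y), each with its maximizer.
primal_sup, primal_arg = grid_max(lambda z: F(z) - 0.5 * z * z, 3.0)
dual_sup, dual_arg = grid_max(lambda y: 0.5 * y * y - F_star(y), 1.0)

if __name__ == "__main__":
    print(primal_sup, dual_sup)   # the suprema should agree up to grid error
    print(primal_arg, dual_arg)   # and so should the maximizers
```

In this example the common maximizer also satisfies the fixed-point equation $z_0 = \nabla F(z_0) = \tanh(\beta z_0)$ used in the proof, up to the resolution of the grid.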

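APPENDIX B: FORM OF $\mathcal{E}_\beta$

We first derive the form of $\mathcal{E}_\beta$ for $\beta > 0$ as given in part (b) of Theorem 3.1. We then prove that
$\mathcal{E}_\beta = \{\rho\}$ for all $\beta \leq 0$.

$\mathcal{E}_\beta$ is defined as the set of $\nu \in \mathcal{P}$ that minimize $R(\nu|\rho) - \frac{\beta}{2}\langle \nu, \nu \rangle$. Since $\beta > 0$, this is equivalent to

\[ \mathcal{E}_\beta = \left\{ \nu \in \mathcal{P} : \nu \text{ maximizes } \tfrac{1}{2}\langle \nu, \nu \rangle - \tfrac{1}{\beta} R(\nu|\rho) \right\}. \tag{B.1} \]

This maximization problem has the form of the right hand side of part (a) of Theorem A.1; viz.,

\[ \sup_{\nu \in \mathcal{P}} \left\{ \tfrac{1}{2}\langle \nu, \nu \rangle - \tfrac{1}{\beta} R(\nu|\rho) \right\} = \sup_{\nu \in \operatorname{dom} F^*} \left\{ \tfrac{1}{2}\|\nu\|^2 - F^*(\nu) \right\} \]

with $F^*(\nu) = \frac{1}{\beta} R(\nu|\rho)$. For $z \in \mathbb{R}^q$ we define the finite, convex, continuous function

\[ \Gamma(z) = \log\left( \sum_{i=1}^q e^{z_i} \frac{1}{q} \right). \tag{B.2} \]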
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Since for v e W Q Thm. VIII. 2. 2] 

(T)» 



i?(z/|p) for v £V 
oo otherwise , 



it follows that for z £ M q 

F(z) = sup {(z, u) - lR{v\p)\ = 4 sup {</3z, i/) - = 4r(/3z). 

Thus by part (a) of Theorem lA.il 



sup |4r(^) - ±||z|| 2 } = sup i\{v,u) - lR(y\p)} 



and by part (b) of the theorem the global maximum points of the two functions coincide. 
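Equation (B.1) now implies that

\[
\begin{aligned}
\mathcal{E}_\beta &= \left\{z \in \mathbb{R}^q : z \text{ maximizes } \tfrac{1}{\beta}\Gamma(\beta z) - \tfrac{1}{2}\|z\|^2\right\} \\
&= \left\{z \in \mathbb{R}^q : z \text{ minimizes } \tfrac{\beta}{2}\|z\|^2 - \Gamma(\beta z)\right\}.
\end{aligned}
\]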
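The equivalence of the two minimization problems can be illustrated numerically for $q = 3$. The sketch below, in Python, compares the grid minimum of $R(\nu|\rho) - \frac{\beta}{2}\langle\nu,\nu\rangle$ over the probability simplex with the grid minimum of $\frac{\beta}{2}\|z\|^2 - \log\sum_i e^{\beta z_i}$ shifted by $\log q$. Two assumptions are made here that are not part of the text: $\rho$ is taken to be uniform, and the search for $z$ is restricted to the simplex, which is justified by the stationarity condition $z_i = e^{\beta z_i}/\sum_k e^{\beta z_k}$. The function names, the grid size, and the values of $\beta$ are illustrative.

```python
import math

Q = 3  # number of spin states in this illustration


def simplex_grid(n):
    """Yield the points (i, j, n - i - j) / n of the probability simplex for q = 3."""
    for i in range(n + 1):
        for j in range(n + 1 - i):
            yield (i / n, j / n, (n - i - j) / n)


def canonical_objective(nu, beta):
    """R(nu|rho) - (beta/2) <nu, nu> for uniform rho, with 0 log 0 = 0."""
    rel_entropy = sum(x * math.log(Q * x) for x in nu if x > 0)
    return rel_entropy - 0.5 * beta * sum(x * x for x in nu)


def G(z, beta):
    """G_beta(z) = (beta/2) ||z||^2 - log sum_i exp(beta z_i)."""
    return 0.5 * beta * sum(x * x for x in z) - math.log(
        sum(math.exp(beta * x) for x in z)
    )


def compare(beta, n=300):
    """Grid minima of both objectives and their grid minimizers.

    The second value is min G_beta + log q, which should match the first
    up to the grid resolution.
    """
    points = list(simplex_grid(n))
    nu_min = min(points, key=lambda p: canonical_objective(p, beta))
    z_min = min(points, key=lambda p: G(p, beta))
    return (
        canonical_objective(nu_min, beta),
        G(z_min, beta) + math.log(Q),
        nu_min,
        z_min,
    )


if __name__ == "__main__":
    for beta in (2.0, 4.0):  # illustrative values of beta > 0
        print(beta, compare(beta))
```

Up to a permutation of coordinates and the grid resolution, the two minimizers returned by `compare` should agree, and the two minimum values should be close. A grid search is used only because it is transparent; it is not an efficient way to locate the minimizers.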



We summarize this discussion in the following corollary. Part (b) of the corollary is proved in part (b) of 
Theorem 2.1 in 



Corollary B.1. We define the finite, convex, continuous function $\Gamma$ in (B.2). The following conclusions hold.

(a) $\mathcal{E}_\beta$ coincides with the set of global minimum points of

\[
G_\beta(z) = \frac{\beta}{2}\|z\|^2 - \log\sum_{i=1}^q e^{\beta z_i} = \frac{\beta}{2}\|z\|^2 - \Gamma(\beta z) - \log q.
\]

(b) For $0 < \beta < \beta_c$, $\beta = \beta_c$, and $\beta > \beta_c$ the set of global minimum points of $G_\beta$ has the form given by the right hand side of (3.4) [Thm. 3.1(b)].

Corollary B.1 completes the proof of Theorem 3.1. Michael Kiessling's proof of this corollary based on
Lagrange multipliers is given in Appendix B of [20]. Continuous analogues of the corollary are mentioned
in lil], 0], and S, but are not proved there. 

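We now show that for all $\beta \leq 0$, $\mathcal{E}_\beta = \{\rho\}$. This is obvious for $\beta = 0$ since $\nu = \rho$ is the unique vector in $\mathcal{P}$ that minimizes $R(\nu|\rho)$. Our goal is to prove that for $\beta < 0$, $\nu = \rho$ is also the unique vector in $\mathcal{P}$ that minimizes $R(\nu|\rho) - \frac{\beta}{2}\langle\nu,\nu\rangle$. Let $\bar\nu$ be a point in $\mathcal{P}$ at which $R(\nu|\rho) - \frac{\beta}{2}\langle\nu,\nu\rangle$ attains its infimum. For any $i = 1, 2, \ldots, q$,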

\[
\frac{\partial\bigl(R(\nu|\rho) - \frac{\beta}{2}\langle\nu,\nu\rangle\bigr)}{\partial\nu_i} = \log\nu_i + \log q + 1 - \beta\nu_i,
\]

which is negative for all sufficiently small $\nu_i > 0$. It follows that $\bar\nu$ does not lie on the relative boundary of $\mathcal{P}$; i.e., $\bar\nu_i > 0$ for all $i = 1, 2, \ldots, q$. We complete the proof by showing that for any $1 \leq j < k \leq q$, $\bar\nu_j = \bar\nu_k$. Since $\rho$ is the only point in $\mathcal{P}$ satisfying these equalities, we will be done.

Given $a \in (0,1)$, we consider the reduced two-variable problem of minimizing $R(\nu|\rho) - \frac{\beta}{2}\langle\nu,\nu\rangle$ over $\nu_j > 0$ and $\nu_k > 0$ under the constraint $\nu_j + \nu_k = a$; all the other components $\nu_i$ are fixed and equal to the corresponding components $\bar\nu_i$ of the minimizer $\bar\nu$. Setting $\nu_k = a - \nu_j$, we define

\[
F(\nu_j) = R(\nu|\rho) - \tfrac{\beta}{2}\langle\nu,\nu\rangle.
\]

Differentiating with respect to $\nu_j$ shows that any global minimizer $\nu_j$ must satisfy

\[
F'(\nu_j) = \log\nu_j - \log(a - \nu_j) - \beta(2\nu_j - a) = 0.
\]

Since

\[
F''(\nu_j) = \frac{1}{\nu_j} + \frac{1}{a - \nu_j} - 2\beta > 0,
\]

$F'(\nu_j)$ is strictly increasing from negative values for all $\nu_j$ near 0 to positive values for all $\nu_j$ near $a$. It follows that the only root of $F'(\nu_j) = 0$ is $\nu_j = \frac{a}{2}$ and thus that $\nu_k = \frac{a}{2} = \nu_j$. Being a global minimizer of $R(\nu|\rho) - \frac{\beta}{2}\langle\nu,\nu\rangle$ over $\mathcal{P}$, $\bar\nu$ is also a global minimizer of the reduced two-variable problem. Since $a \in (0,1)$ is arbitrary, it follows that for any distinct pair of indices $\bar\nu_j = \bar\nu_k$. This completes the proof.
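The monotonicity argument for the reduced two-variable problem can be checked numerically. The following Python sketch uses $F'(\nu_j) = \log\nu_j - \log(a - \nu_j) - \beta(2\nu_j - a)$ and $F''(\nu_j) = 1/\nu_j + 1/(a - \nu_j) - 2\beta$ for $\beta < 0$ and locates the root of $F'$ by bisection. The function names, the bracketing interval, and the sample values of $a$ and $\beta$ are illustrative assumptions, not part of the proof.

```python
import math


def F_prime(nu_j, a, beta):
    """First derivative of the reduced objective with nu_k = a - nu_j."""
    return math.log(nu_j) - math.log(a - nu_j) - beta * (2.0 * nu_j - a)


def F_second(nu_j, a, beta):
    """Second derivative; strictly positive on (0, a) whenever beta < 0."""
    return 1.0 / nu_j + 1.0 / (a - nu_j) - 2.0 * beta


def root_by_bisection(a, beta, tol=1e-12):
    """Locate the zero of F' on (0, a), using that F' is increasing there."""
    lo, hi = a * 1e-9, a * (1.0 - 1e-9)  # F' < 0 near 0 and F' > 0 near a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F_prime(mid, a, beta) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for a in (0.2, 0.5, 0.9):
        for beta in (-0.5, -3.0):
            print(a, beta, root_by_bisection(a, beta))
```

For each sampled pair the computed root should lie at $a/2$ up to the bisection tolerance, in line with the conclusion $\nu_j = \nu_k$.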